a Department of Agronomy, Purdue University, West Lafayette, IN 47905-1150 USA
bernardo{at}purdue.edu
| ABSTRACT |
|---|
Abbreviations: cM, centiMorgans; QTL, quantitative trait locus or loci; T-BLUP, best linear unbiased prediction based on trait data; TM-BLUP, best linear unbiased prediction based on trait and marker data
| INTRODUCTION |
|---|
In maize (Zea mays L.), many studies have identified molecular markers associated with grain yield (Edwards et al., 1987; Stuber et al., 1992; Zehr et al., 1992; Beavis et al., 1994; Veldboom and Lee, 1994; Ajmone-Marsan et al., 1995; Austin and Lee, 1996; Eathington et al., 1997; Melchinger et al., 1998), disease resistance (Bubeck et al., 1993; Freymark et al., 1993), insect tolerance (Schön et al., 1993), kernel chemical composition (Goldman et al., 1993; Schön et al., 1994), and morphological traits (Beavis et al., 1991; Koester et al., 1993; Schön et al., 1994; Veldboom et al., 1994). But despite the detection of QTL for such traits, marker-assisted selection based on regression of trait means on marker genotypes (Lande and Thompson, 1990) has not been widely used in maize breeding (Smith and Beavis, 1996). Genotype x environment interaction causes QTL effects to vary across environments (Beavis and Keim, 1995). The linkage phase between a marker locus and a QTL may vary among inbreds, thereby limiting the estimates of marker-associated effects to the mapping population studied. The resulting inconsistency in means of marker genotypes among mapping populations and environments causes difficulty in improving traits by selecting for desirable marker alleles in any given breeding population.
Best linear unbiased prediction based on trait and marker data (TM-BLUP) is an alternative approach that may be feasible in maize breeding (Bernardo, 1998). In TM-BLUP, the identity by descent of unobservable QTL alleles is inferred from the observed genotypes at marker loci that flank the QTL. The TM-BLUP approach therefore models the covariances, but not the means, associated with QTL. Neither linkage disequilibrium between marker loci and QTL, nor information on the mean effect associated with a particular marker allele, is needed in TM-BLUP (Wang et al., 1995). The TM-BLUP procedure requires information on (i) the recombination frequencies between a QTL and its flanking markers and (ii) QTL variances. The recombination frequencies can be estimated from large mapping populations evaluated in a large number of environments (Beavis, 1994). Assuming the recombination frequencies are consistent across populations and environments, the QTL variances in other populations and environments could be subsequently estimated from phenotypic data routinely generated in breeding programs.
Can knowledge of marker-QTL associations help breeders identify superior single crosses prior to field testing? The objective of this simulation study was to compare the effectiveness of T-BLUP and TM-BLUP for predicting single-cross performance.
| THEORY AND METHODS |
|---|
The simulations mimicked the following conditions relevant to maize breeding: (i) heterotic groups are complementary; (ii) elite inbreds within heterotic groups are crossed to obtain new recombinant inbreds; (iii) about 15% of the possible single-cross combinations are evaluated in field trials; and (iv) dominance effects for grain yield are strong, but the ratio of dominance variance to total genetic variance is often 0.20 (Bernardo, 1996a). Other assumptions were that the QTL variances were known, the QTL were unlinked with each other, and epistasis was absent.
Inbreds and Their Genotypes
Each heterotic group (denoted Group 1 and Group 2) comprised 55 inbreds. Four were founder inbreds, six were second-cycle recombinant inbreds, 15 were third-cycle recombinant inbreds, and 30 were fourth-cycle recombinant inbreds. The founder inbreds were unrelated within and between heterotic groups. One second-cycle inbred was randomly derived from the F2 generation of each of the six possible crosses among founder inbreds. One third-cycle inbred was randomly derived from the F2 generation of each of the 15 possible crosses among second-cycle inbreds. Two fourth-cycle inbreds were randomly derived from each of the F2 generations obtained by chain crossing the 15 third-cycle inbreds, i.e., 1 x 2, 2 x 3, 3 x 4, ..., and 15 x 1.
The n = 10, 50, or 100 QTL were unlinked with each other. The kth (k = 1 to n) QTL, denoted Qk, was linked to two flanking marker loci (Mk and Nk). The Qk locus was located exactly midway between Mk and Nk. The map distance between Qk and either flanking marker was constant across all n QTL, and was equal to 2.5 or 10 cM.
There were four alleles (+, +', -, and -') at each Qk locus. At any given locus, alleles + and - were found only in one heterotic group, whereas +' and -' were found only in the other heterotic group. Group 1 had the + and - alleles at odd-numbered QTL, and the +' and -' alleles at even-numbered QTL. In contrast, Group 2 had the + and - alleles at even-numbered QTL, and the +' and -' alleles at odd-numbered QTL. The founder inbreds in each heterotic group were randomly derived from a conceptual base population wherein the two alleles at Qk had frequencies of 1/2 across all n QTL. Each founder inbred differed in its marker allele at both Mk and Nk.
I simulated recombination between Mk and Qk and between Qk and Nk (i.e., for k = 1 to n) during the development of second-, third-, and fourth-cycle inbreds. The recombinant inbreds were derived at random according to the expected frequencies of parental types, single crossovers, and double crossovers among Mk, Qk, and Nk (Table 1). With a single meiosis, the recombination frequency was r between Mk and Qk, as well as between Qk and Nk. The map distances were transformed into r with Haldane's mapping function (Haldane, 1919). Among recombinant inbreds, the expected frequency of recombinants was R1 = 2r/(1 + 2r) between Mk and Qk, and R2 = R1 between Qk and Nk (Haldane and Waddington, 1931). Interference was absent. The frequency of recombinants between Mk and Nk was R = R1 + R2 - 2R1R2. Values of R, R1, and R2 were constant across all n QTL.
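The conversion from map distance to the recombinant frequencies described above can be sketched as follows. This is a minimal illustration rather than the simulation code used in the study; the function names are hypothetical, and the formulas are the ones given in the text (Haldane's mapping function, R1 = 2r/(1 + 2r), and R = R1 + R2 - 2R1R2).

```python
import math


def haldane_r(map_distance_cm):
    """Recombination frequency in one meiosis from map distance (Haldane, no interference)."""
    morgans = map_distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def ril_recombinants(r):
    """Expected frequency of recombinants among selfed recombinant inbreds: 2r / (1 + 2r)."""
    return 2.0 * r / (1.0 + 2.0 * r)


def flanking_frequencies(map_distance_cm):
    """Return (r, R1, R2, R) for a QTL located midway between two flanking markers."""
    r = haldane_r(map_distance_cm)
    R1 = ril_recombinants(r)      # between Mk and Qk
    R2 = R1                       # between Qk and Nk (QTL is midway)
    R = R1 + R2 - 2.0 * R1 * R2   # between Mk and Nk, no interference
    return r, R1, R2, R


# The two marker-QTL distances considered in the study.
for distance in (2.5, 10.0):
    r, R1, R2, R = flanking_frequencies(distance)
    print(f"{distance:>4} cM: r = {r:.4f}, R1 = R2 = {R1:.4f}, R = {R:.4f}")
```

With a 10-cM distance, about 15% of recombinant inbreds are expected to be recombinant between a marker and the QTL, compared with about 5% at 2.5 cM.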
[...] The testcross additive variance in Group 2 (i.e., when crossed to Group 1) was [equation missing]. At even-numbered QTL, [equation missing] and [equation missing]. The dominance variance at both odd- and even-numbered QTL was [equation missing]. Summing across n QTL, the genetic variances were: [equations missing]; and [equation missing]. Total genetic variance was VG = VA(1) + VA(2) + VD. The resulting ratio of VD/VG was 0.167, which was close to the average ratio of 0.20 for maize grain yield when the number of single crosses was >250 (Bernardo, 1996a). The phenotypic value of a single cross was equal to its genetic value plus a random nongenetic effect. Nongenetic effects were normally and independently distributed with a mean of zero and variance VE. The value of VE was calculated based on h² = VG/(VG + VE) equal to 0.2, 0.4, 0.6, 0.8, or 1.
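Solving h² = VG/(VG + VE) for the nongenetic variance gives VE = VG(1 - h²)/h². The sketch below applies this and adds normally distributed nongenetic effects to a list of genetic values. The function names, the seed, and the example value VG = 6 are illustrative assumptions, not values from the study.

```python
import math
import random


def nongenetic_variance(v_g, h2):
    """Solve h2 = VG / (VG + VE) for VE."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    return v_g * (1.0 - h2) / h2


def phenotypic_values(genetic_values, v_g, h2, seed=0):
    """Phenotype = genetic value + independent N(0, VE) nongenetic effect."""
    rng = random.Random(seed)
    sd = math.sqrt(nongenetic_variance(v_g, h2))
    return [g + rng.gauss(0.0, sd) for g in genetic_values]


# Hypothetical total genetic variance, evaluated at the heritabilities in the text.
for h2 in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"h2 = {h2}: VE = {nongenetic_variance(6.0, h2):.2f}")
```

At h² = 1 the nongenetic variance is zero, so phenotypic and genetic values coincide.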
Tested and Untested Single Crosses
There were 3025 single crosses between the 55 inbreds in Group 1 and 55 inbreds in Group 2. A total of 500 single crosses were randomly chosen as tested hybrids, whereas the remaining 2525 single crosses were considered untested (but had known genetic values). The performance of the 2525 untested single crosses was predicted from the phenotypic values of the 500 tested single crosses by T-BLUP and TM-BLUP.
Covariance between Single Crosses
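Assume i and j are inbreds from Group 1, whereas i' and j' are inbreds from Group 2. I assumed different proportions of QTL (p = 0.1, 0.3, 0.6, or 1) had known linkage to markers for the purpose of comparing T-BLUP and TM-BLUP. Different combinations of n and p resulted in different proportions of VG that were accounted for by the marked QTL (Table 2).

In T-BLUP, the covariance between single crosses i x i' and j x j' was modeled from the coefficients of coancestry fij and fi'j':

$$\mathrm{Cov}(i \times i',\, j \times j') = f_{ij} V_{A(1)} + f_{i'j'} V_{A(2)} + f_{ij} f_{i'j'} V_{D} \tag{1}$$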
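The T-BLUP covariance, fijVA(1) + fi'j'VA(2) + fijfi'j'VD, can be evaluated directly once the coancestries are known. The sketch below is illustrative only: the function name is hypothetical, and the variances are made-up values chosen so that VD/VG = 1/6, the ratio reported above.

```python
def tblup_covariance(f_ij, f_i2j2, v_a1, v_a2, v_d):
    """Covariance between single crosses i x i' and j x j' from pedigree coancestries.

    f_ij   : coancestry between the Group 1 parents i and j
    f_i2j2 : coancestry between the Group 2 parents i' and j'
    """
    return f_ij * v_a1 + f_i2j2 * v_a2 + f_ij * f_i2j2 * v_d


# Hypothetical variances (not from the study); VD / VG = 1.0 / 6.0.
V_A1, V_A2, V_D = 2.5, 2.5, 1.0

# A cross with itself: both coancestries are 1 for inbreds, so the result is VG.
print(tblup_covariance(1.0, 1.0, V_A1, V_A2, V_D))  # 6.0

# Crosses sharing the Group 1 parent, with unrelated Group 2 parents.
print(tblup_covariance(1.0, 0.0, V_A1, V_A2, V_D))  # 2.5

# Parents related with coancestry 0.5 in both groups.
print(tblup_covariance(0.5, 0.5, V_A1, V_A2, V_D))  # 2.75
```

The dominance term contributes only when the parents are related in both heterotic groups, because it is weighted by the product fijfi'j'.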
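In TM-BLUP, information on marker-QTL linkage was known for the first pn QTL, which had the largest effects. The marker-QTL linkages were assumed unknown at the remaining (1 - p)n QTL. The genotypes at Qk were assumed unknown and were inferred from the genotypes at Mk and Nk. Definition of alleles according to their origin becomes necessary. The alleles homozygous in a given inbred are denoted in superscript, i.e., inbred i is homozygous for the alleles $M_k^i$, $Q_k^i$, and $N_k^i$. The conditional covariance between single crosses, given the observed marker genotypes ($G_k^{obs}$) at Mk and Nk, was (Bernardo, 1998):

$$
\begin{aligned}
\mathrm{Cov}(i \times i',\, j \times j' \mid G^{obs}) ={}& \sum_{k=1}^{pn} \Big[ \Pr(Q_k^i \equiv Q_k^j)\, V_{A(1)}^k + \Pr(Q_k^{i'} \equiv Q_k^{j'})\, V_{A(2)}^k + \Pr(Q_k^i \equiv Q_k^j)\, \Pr(Q_k^{i'} \equiv Q_k^{j'})\, V_{D}^k \Big] \\
&+ \sum_{k=pn+1}^{n} \Big[ f_{ij} V_{A(1)}^k + f_{i'j'} V_{A(2)}^k + f_{ij} f_{i'j'} V_{D}^k \Big]
\end{aligned} \tag{2}
$$

where $\Pr(Q_k^i \equiv Q_k^j)$ is the probability that $Q_k^i$ and $Q_k^j$ are identical by descent given $G_k^{obs}$, and $\Pr(Q_k^{i'} \equiv Q_k^{j'})$ is the probability that $Q_k^{i'}$ and $Q_k^{j'}$ are identical by descent given $G_k^{obs}$. The calculation of $\Pr(Q_k^i \equiv Q_k^j)$ and $\Pr(Q_k^{i'} \equiv Q_k^{j'})$ is outlined below.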
Probability of Descent of a Marked QTL Allele
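Suppose a and b are the parental inbreds of i, and j is not a direct descendant of i. The $\Pr(Q_k^i \equiv Q_k^j)$ term can be expressed in terms of the conditional probability that $Q_k^j \equiv Q_k^a$ and $Q_k^j \equiv Q_k^b$ given $G_k^{obs}$ (Bernardo, 1998):

$$
\begin{aligned}
\Pr(Q_k^i \equiv Q_k^j \mid G_k^{obs}) ={}& \Pr(Q_k^i \Leftarrow Q_k^a \mid G_k^{obs})\, \Pr(Q_k^j \equiv Q_k^a \mid G_k^{obs}) \\
&+ \Pr(Q_k^i \Leftarrow Q_k^b \mid G_k^{obs})\, \Pr(Q_k^j \equiv Q_k^b \mid G_k^{obs})
\end{aligned} \tag{3}
$$

where $Q_k^i \Leftarrow Q_k^a$ denotes that $Q_k^i$ descended from $Q_k^a$.

The conditional probability that $Q_k^i \Leftarrow Q_k^a$, given $G_k^{obs}$, was obtained from the expected frequencies of parental types, single crossovers, and double crossovers among Mk, Qk, and Nk (Table 1):

$$
\begin{aligned}
\Pr(Q_k^i \Leftarrow Q_k^a \mid G_k^{obs}) ={}& \Pr(M_k^i \Leftarrow M_k^a)\, \Pr(N_k^i \Leftarrow N_k^a)\, \frac{(1 - R_1)(1 - R_2)}{1 - R} \\
&+ \Pr(M_k^i \Leftarrow M_k^a)\, \Pr(N_k^i \Leftarrow N_k^b)\, \frac{(1 - R_1) R_2}{R} \\
&+ \Pr(M_k^i \Leftarrow M_k^b)\, \Pr(N_k^i \Leftarrow N_k^a)\, \frac{R_1 (1 - R_2)}{R} \\
&+ \Pr(M_k^i \Leftarrow M_k^b)\, \Pr(N_k^i \Leftarrow N_k^b)\, \frac{R_1 R_2}{1 - R}
\end{aligned} \tag{4}
$$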
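The values of $\Pr(M_k^i \Leftarrow M_k^a)$, $\Pr(M_k^i \Leftarrow M_k^b)$, $\Pr(N_k^i \Leftarrow N_k^a)$, and $\Pr(N_k^i \Leftarrow N_k^b)$ were determined from the marker genotypes, and were equal to 1 or 0 if Mk or Nk was polymorphic between a and b. Suppose a has the +/+ genotype, b has the -/- genotype, and i has the +/+ genotype at Mk. In this example, $\Pr(M_k^i \Leftarrow M_k^a) = 1$ and $\Pr(M_k^i \Leftarrow M_k^b) = 0$. But if a and b both have the +/+ genotype, the values of $\Pr(M_k^i \Leftarrow M_k^a)$ and $\Pr(M_k^i \Leftarrow M_k^b)$ are simply determined from the parental contributions to inbred progeny, i.e., the proportion of the genome derived directly by an inbred from each of its two parents. For example, if i is a BC1-derived inbred, a is the recurrent parent, b is the donor parent, and a and b have the +/+ genotype, then $\Pr(M_k^i \Leftarrow M_k^a) = 0.75$ and $\Pr(M_k^i \Leftarrow M_k^b) = 0.25$. The inbreds in this study were derived from F2 populations. Thus, $\Pr(M_k^i \Leftarrow M_k^a) = \Pr(M_k^i \Leftarrow M_k^b) = 0.5$ when Mk was not polymorphic between a and b.
Because $Q_k^i$ must have descended from either $Q_k^a$ or $Q_k^b$, $\Pr(Q_k^i \Leftarrow Q_k^b)$ was calculated as $1 - \Pr(Q_k^i \Leftarrow Q_k^a)$.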
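The probability that a QTL allele in inbred i descended from parent a, given the descent of its two flanking markers, can be sketched as below. This is an illustration under stated assumptions, not the author's code: it assumes the standard no-interference flanking-marker probabilities built from R1, R2, and R, and it assumes that the descent probabilities at the two markers can be multiplied. The function name and arguments are hypothetical.

```python
def prob_qtl_from_parent_a(pm_a, pn_a, R1, R2):
    """Pr(QTL allele of inbred i descended from parent a | flanking markers).

    pm_a, pn_a : probabilities that the alleles at markers Mk and Nk came from a
                 (1 or 0 if the marker is polymorphic between the parents,
                 0.5 for an uninformative marker in an F2-derived inbred)
    R1, R2     : recombinant frequencies Mk-Qk and Qk-Nk among recombinant inbreds
    """
    if not (0.0 < R1 < 1.0 and 0.0 < R2 < 1.0):
        raise ValueError("R1 and R2 must lie strictly between 0 and 1")
    R = R1 + R2 - 2.0 * R1 * R2          # recombinants between Mk and Nk
    pm_b, pn_b = 1.0 - pm_a, 1.0 - pn_a  # descent from the other parent, b

    both_from_a = (1.0 - R1) * (1.0 - R2) / (1.0 - R)  # parental type a
    m_a_n_b = (1.0 - R1) * R2 / R                      # crossover between Qk and Nk
    m_b_n_a = R1 * (1.0 - R2) / R                      # crossover between Mk and Qk
    both_from_b = R1 * R2 / (1.0 - R)                  # double crossover

    return (pm_a * pn_a * both_from_a + pm_a * pn_b * m_a_n_b
            + pm_b * pn_a * m_b_n_a + pm_b * pn_b * both_from_b)


R1 = R2 = 0.15  # roughly the value for a 10-cM marker-QTL distance
p_a = prob_qtl_from_parent_a(1.0, 1.0, R1, R2)   # both markers from a
p_b = 1.0 - p_a                                  # QTL allele must come from a or b
print(round(p_a, 4), round(p_b, 4))
print(prob_qtl_from_parent_a(1.0, 0.0, R1, R2))  # markers disagree, QTL midway
```

When the two flanking markers trace to different parents and the QTL is midway between them, the markers carry no net information and the probability is 0.5, the same as for two uninformative markers.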
Prediction of Single-Cross Performance
Let yT be a 500 by 1 vector of phenotypic values of the 500 tested single crosses. The performance of the 2525 untested single crosses was predicted as
![]() | (5) |
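A minimal sketch of the prediction step follows. It assumes the usual BLUP form, in which predictions for untested crosses are C V⁻¹(yT - μ) added back to μ, where C holds genetic covariances between untested and tested crosses and V is the genetic covariance matrix among tested crosses plus VE on the diagonal. The symbols C, V, and μ, the function names, and the numbers in the example are assumptions for illustration and are not taken from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def blup_predict(c_ut, g_tt, v_e, y_t, mean):
    """Predict untested crosses as mean + C V^-1 (y_T - mean), with V = G_TT + VE * I."""
    n = len(y_t)
    v = [[g_tt[i][j] + (v_e if i == j else 0.0) for j in range(n)] for i in range(n)]
    weights = solve(v, [y - mean for y in y_t])
    return [mean + sum(c * w for c, w in zip(row, weights)) for row in c_ut]


# Two tested crosses and two untested crosses (hypothetical covariances).
g_tt = [[6.0, 2.5], [2.5, 6.0]]   # genetic covariances among tested crosses
c_ut = [[2.75, 2.5], [0.0, 0.0]]  # untested-by-tested genetic covariances
print(blup_predict(c_ut, g_tt, v_e=4.0, y_t=[10.0, 7.0], mean=8.0))
```

An untested cross with zero covariance to every tested cross is predicted at the mean, which is why predictions depend so heavily on how the covariances are modeled in T-BLUP versus TM-BLUP.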
Simple correlations between the predicted performance and true genetic values of the 2525 untested single crosses were calculated. The mean correlation across the 50 repeats was calculated. Pairwise two-sided significance tests (P = 0.05) among the mean correlations were done by standard bootstrapping (Efron, 1979). The significance tests were based on 5000 bootstrap samples, each sample comprising 50 paired correlations randomly selected with replacement from the 50 repeats.
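The paired bootstrap comparison can be sketched as follows. The text specifies 5000 bootstrap samples of 50 paired correlations drawn with replacement; the use of a percentile interval on the mean paired difference to decide significance is an assumption of this sketch, as are the function name, the seed, and the example data.

```python
import random


def bootstrap_paired_test(x, y, n_boot=5000, alpha=0.05, seed=0):
    """Two-sided percentile-bootstrap test of the mean paired difference x - y.

    Returns (observed mean difference, (lower, upper) interval, significant flag).
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    boot_means = []
    for _ in range(n_boot):
        # Resample the paired differences with replacement, keeping pairs intact.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(sample) / n)
    boot_means.sort()
    lower = boot_means[int((alpha / 2.0) * n_boot)]
    upper = boot_means[int((1.0 - alpha / 2.0) * n_boot) - 1]
    significant = not (lower <= 0.0 <= upper)
    return sum(diffs) / n, (lower, upper), significant


# Made-up correlations for 50 repeats of two prediction methods.
rng = random.Random(1)
method_1 = [0.70 + rng.gauss(0.0, 0.03) for _ in range(50)]
method_2 = [c - 0.02 + rng.gauss(0.0, 0.01) for c in method_1]
print(bootstrap_paired_test(method_1, method_2))
```

Resampling whole pairs, rather than each method's correlations separately, preserves the dependence that arises because both methods are evaluated on the same simulated repeats.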
| RESULTS AND DISCUSSION |
|---|
The reason for this lack of improvement can be deduced from the covariance between single crosses in T-BLUP (Eq. [1]) and in TM-BLUP (Eq. [2]) under simplifying assumptions. Assume p = 1 and that the probability that QTL alleles are identical by descent [i.e., $\Pr(Q_k^i \equiv Q_k^j)$ and $\Pr(Q_k^{i'} \equiv Q_k^{j'})$] is known at each locus. The covariance between single crosses in TM-BLUP becomes

$$\sum_{k=1}^{n} \Big[ \Pr(Q_k^i \equiv Q_k^j)\, V_{A(1)}^k + \Pr(Q_k^{i'} \equiv Q_k^{j'})\, V_{A(2)}^k + \Pr(Q_k^i \equiv Q_k^j)\, \Pr(Q_k^{i'} \equiv Q_k^{j'})\, V_{D}^k \Big]$$

For recombinant inbreds derived at random, the expectation of $\frac{1}{n}\sum_k \Pr(Q_k^i \equiv Q_k^j)$ is equal to fij. Likewise, the expectation of $\frac{1}{n}\sum_k \Pr(Q_k^{i'} \equiv Q_k^{j'})$ is equal to fi'j'. The expectations of the covariance between $\Pr(Q_k^i \equiv Q_k^j)$ and $V_{A(1)}^k$, between $\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{A(2)}^k$, and between $\Pr(Q_k^i \equiv Q_k^j)\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{D}^k$ are equal to zero for recombinant inbreds derived at random. In other words, whether or not two inbreds have alleles at a particular QTL that are identical by descent does not depend on the magnitude of the testcross additive or dominance variance at that QTL. Consequently, the covariance between single crosses in TM-BLUP becomes equivalent to

$$f_{ij} \sum_{k=1}^{n} V_{A(1)}^k + f_{i'j'} \sum_{k=1}^{n} V_{A(2)}^k + f_{ij} f_{i'j'} \sum_{k=1}^{n} V_{D}^k$$

This expression is equal to fijVA(1) + fi'j'VA(2) + fijfi'j'VD, i.e., the covariance between single crosses in T-BLUP. Therefore, the lack of improvement in the predictions with marker data was due to the identical expectations of the covariance between single crosses in T-BLUP and TM-BLUP in the absence of selection.
When a trait is controlled by few QTL, the observed covariances between $\Pr(Q_k^i \equiv Q_k^j)$ and $V_{A(1)}^k$, between $\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{A(2)}^k$, and between $\Pr(Q_k^i \equiv Q_k^j)\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{D}^k$ may deviate substantially from zero because of sampling variation. This phenomenon accounts for the greater advantage of TM-BLUP over T-BLUP when the trait is controlled by fewer loci. The loss of advantage of TM-BLUP over T-BLUP could apply not only to single crosses, but also to predicting breeding values of individuals within a population, as proposed by van Arendonk et al. (1994). Markers may be most useful for dissecting complex quantitative traits that are controlled by many, rather than few, QTL. It is ironic that TM-BLUP is most useful when few QTL control the trait or when genetic gain is sought only at a limited subset of QTL. The latter situation may be true for germplasm introgression programs, wherein inbreds are derived from backcross populations with an adapted inbred as the recurrent parent and an exotic population as the donor parent.
TM-BLUP might also be more effective when recombinant inbreds are derived through selection rather than at random. Selection during inbreeding may cause deviations between the average identity by descent across QTL [i.e., $\frac{1}{n}\sum_k \Pr(Q_k^i \equiv Q_k^j)$ and $\frac{1}{n}\sum_k \Pr(Q_k^{i'} \equiv Q_k^{j'})$] and Malecot's coefficients of coancestry based on pedigree (fij and fi'j'). Selection may also cause nonzero covariances between $\Pr(Q_k^i \equiv Q_k^j)$ and $V_{A(1)}^k$, between $\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{A(2)}^k$, and between $\Pr(Q_k^i \equiv Q_k^j)\Pr(Q_k^{i'} \equiv Q_k^{j'})$ and $V_{D}^k$. Consequently, the covariance between single crosses would differ between T-BLUP and TM-BLUP. On the other hand, if n is large, the effect of selection at individual QTL may be too small to cause a difference in the covariance between single crosses with T-BLUP and TM-BLUP.
In this study, I assumed that (i) the genetic variances associated with QTL were known, (ii) the QTL were unlinked with each other, and (iii) epistasis was absent. In practice, errors in the estimation of $V_{A(1)}^k$, $V_{A(2)}^k$, and $V_{D}^k$ can only lead to further decreases in the effectiveness of TM-BLUP relative to T-BLUP. The assumption of unlinked QTL is unrealistic when many QTL exist in a finite genome. The effects of assuming unlinked QTL in TM-BLUP, when linkages among QTL exist, are unclear. If the QTL are linked with each other, Eq. [2] should be expanded to account for the covariance of effects at linked QTL, or might be used as an approximation. The effects of epistasis were assumed negligible for two reasons. First, epistatic variances have smaller contributions than VA(1), VA(2), and VD in the covariance between single crosses (Stuber and Cockerham, 1966). Second, epistatic variance is often small even when strong physiological epistasis is present. For example, epistatic variance comprises only 14% of VG with complementary gene action, i.e., 9:7 ratio in the F2 of a dihybrid cross.
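The 14% figure for complementary gene action can be checked numerically. The sketch below assumes two unlinked loci in an F2 (allele frequencies of 1/2, genotype frequencies 1/4, 1/2, 1/4 at each locus) and a genotypic value of 1 when at least one dominant allele is present at both loci and 0 otherwise, which gives the 9:7 ratio. The function and variable names are illustrative.

```python
from itertools import product

# F2 genotype frequencies at one locus, indexed by copies of the dominant allele.
GENO_FREQ = {0: 0.25, 1: 0.5, 2: 0.25}


def value(a_copies, b_copies):
    """Complementary gene action: trait expressed only if both loci carry a dominant allele."""
    return 1.0 if a_copies > 0 and b_copies > 0 else 0.0


def variance_components():
    """Return (VG, single-locus variance, epistatic variance) for the 9:7 model."""
    genos = list(product(GENO_FREQ, repeat=2))
    freq = {g: GENO_FREQ[g[0]] * GENO_FREQ[g[1]] for g in genos}
    mean = sum(freq[g] * value(*g) for g in genos)
    v_g = sum(freq[g] * (value(*g) - mean) ** 2 for g in genos)

    def marginal_variance(locus):
        # Variance of genotype means at one locus, averaged over the other locus.
        total = 0.0
        for x, fx in GENO_FREQ.items():
            m = sum(GENO_FREQ[o] * (value(x, o) if locus == 0 else value(o, x))
                    for o in GENO_FREQ)
            total += fx * (m - mean) ** 2
        return total

    v_single = marginal_variance(0) + marginal_variance(1)  # additive + dominance
    return v_g, v_single, v_g - v_single


v_g, v_single, v_epi = variance_components()
print(round(v_epi / v_g, 3))  # 0.143
```

Most of the genetic variance under this strongly epistatic model is captured by the single-locus (additive plus dominance) effects, and the epistatic share is exactly 1/7.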
The main conclusions from this study are that (i) T-BLUP is effective in predicting single-cross performance even with moderate h² (i.e., 0.4 to 0.6), and (ii) molecular markers that flank QTL do not greatly improve the predictions, even when all QTL are tightly linked to flanking markers. Any advantage of TM-BLUP over T-BLUP decreases as the number of QTL increases. These results cast doubt on the usefulness of TM-BLUP in applied breeding programs, not only for predicting single-cross performance, but also for selecting individuals within plant or animal populations.
| ACKNOWLEDGMENTS |
|---|
| NOTES |
|---|
Received for publication October 16, 1998.
| REFERENCES |
|---|