Published in Crop Sci 39:1277-1282 (1999)
© 1999 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

CROP BREEDING, GENETICS & CYTOLOGY

Marker-Assisted Best Linear Unbiased Prediction of Single-Cross Performance

Rex Bernardo

Department of Agronomy, Purdue University, West Lafayette, IN 47905-1150 USA

bernardo@purdue.edu


    ABSTRACT
Predicting the performance of untested single crosses is important in hybrid breeding programs. The objective of this study was to compare the effectiveness of best linear unbiased prediction based on trait data alone (T-BLUP) and trait and marker data combined (TM-BLUP). The simulation procedure involved creating founder and recombinant inbreds in each of two heterotic groups, determining genetic and phenotypic values of 3025 single crosses, randomly partitioning the single crosses into 500 tested and 2525 untested hybrids, and calculating the correlation between the true and predicted performance of untested single crosses. The T-BLUP correlations ranged from 0.74 to 0.84, with n = 10, 50, or 100 quantitative trait loci (QTL) and trait heritability of 0.4 or 0.6. The advantage of TM-BLUP over T-BLUP decreased as n increased. With n = 50 or 100, the TM-BLUP correlations exceeded the T-BLUP correlations by 0.00 to 0.03, even when all QTL were tightly linked to flanking markers. The usefulness of TM-BLUP is doubtful, not only for predicting single-cross performance, but also for predicting breeding values of individuals within populations. The TM-BLUP procedure is useful when few QTL control a trait, or when genetic gain is sought only at a limited subset of QTL.

Abbreviations: cM, centiMorgans • QTL, quantitative trait locus or loci • T-BLUP, best linear unbiased prediction based on trait data • TM-BLUP, best linear unbiased prediction based on trait and marker data


    INTRODUCTION
SINGLE-CROSS CULTIVARS are made between inbreds from two complementary heterotic groups. Predicting the performance of untested single crosses is an important objective in hybrid breeding programs. In previous studies, I found best linear unbiased prediction based on trait data (T-BLUP) useful for identifying superior single crosses prior to field testing (Bernardo, 1996a, 1996b). In the T-BLUP procedure, predictions are made on the basis of known genetic relationships among parental inbreds and available performance data for related single crosses.

In maize (Zea mays L.), many studies have identified molecular markers associated with grain yield (Edwards et al., 1987; Stuber et al., 1992; Zehr et al., 1992; Beavis et al., 1994; Veldboom and Lee, 1994; Ajmone-Marsan et al., 1995; Austin and Lee, 1996; Eathington et al., 1997; Melchinger et al., 1998), disease resistance (Bubeck et al., 1993; Freymark et al., 1993), insect tolerance (Schön et al., 1993), kernel chemical composition (Goldman et al., 1993; Schön et al., 1994), and morphological traits (Beavis et al., 1991; Koester et al., 1993; Schön et al., 1994; Veldboom et al., 1994). But despite the detection of QTL for such traits, marker-assisted selection based on regression of trait means on marker genotypes (Lande and Thompson, 1990) has not been widely used in maize breeding (Smith and Beavis, 1996). Genotype x environment interaction causes QTL effects to vary across environments (Beavis and Keim, 1995). The linkage phase between a marker locus and a QTL may vary among inbreds, thereby limiting the estimates of marker-associated effects to the mapping population studied. The resulting inconsistency in means of marker genotypes among mapping populations and environments causes difficulty in improving traits by selecting for desirable marker alleles in any given breeding population.

Best linear unbiased prediction based on trait and marker data (TM-BLUP) is an alternative approach that may be feasible in maize breeding (Bernardo, 1998). In TM-BLUP, the identity by descent of unobservable QTL alleles is inferred from the observed genotypes at marker loci that flank the QTL. The TM-BLUP approach therefore models the covariances, but not the means, associated with QTL. Neither linkage disequilibrium between marker loci and QTL, nor information on the mean effect associated with a particular marker allele, is needed in TM-BLUP (Wang et al., 1995). The TM-BLUP procedure requires information on (i) the recombination frequencies between a QTL and its flanking markers and (ii) QTL variances. The recombination frequencies can be estimated from large mapping populations evaluated in a large number of environments (Beavis, 1994). Assuming the recombination frequencies are consistent across populations and environments, the QTL variances in other populations and environments could be subsequently estimated from phenotypic data routinely generated in breeding programs.

Can knowledge of marker–QTL associations help breeders identify superior single crosses prior to field testing? The objective of this simulation study was to compare the effectiveness of T-BLUP and TM-BLUP for predicting single-cross performance.


    Theory and methods
I wrote a Fortran program to (i) simulate genotypes of founder inbreds and recombinant inbreds in each of two unrelated heterotic groups, (ii) simulate genetic and phenotypic values of single crosses, (iii) randomly partition the single crosses into tested and untested hybrids, and (iv) calculate the correlation between the true and predicted performance of untested single crosses for T-BLUP and TM-BLUP. The number of QTL was n = 10, 50, or 100. Trait heritability was $h^2$ = 0.2, 0.4, 0.6, 0.8, or 1. Each of the n QTL was linked to one marker to its left and one marker to its right, i.e., only a single QTL lay between each pair of flanking markers. In TM-BLUP, however, I assumed that only a specific proportion (p = 0.1, 0.3, 0.6, or 1) of these QTL–marker linkages was known. The map distance between a QTL and its flanking markers was 2.5 or 10 centiMorgans (cM). Fifty repeats of the experiment (Steps i–iv above) were conducted for each combination of n and $h^2$.

The simulations mimicked the following conditions relevant to maize breeding: (i) heterotic groups are complementary; (ii) elite inbreds within heterotic groups are crossed to obtain new recombinant inbreds; (iii) about 15% of the possible single-cross combinations are evaluated in field trials; and (iv) dominance effects for grain yield are strong, but the ratio of dominance variance to total genetic variance is often ≤0.20 (Bernardo, 1996a). Other assumptions were that the QTL variances were known, the QTL were unlinked with each other, and epistasis was absent.

Inbreds and Their Genotypes
Each heterotic group (denoted Group 1 and Group 2) comprised 55 inbreds. Four were founder inbreds, six were second-cycle recombinant inbreds, 15 were third-cycle recombinant inbreds, and 30 were fourth-cycle recombinant inbreds. The founder inbreds were unrelated within and between heterotic groups. One second-cycle inbred was randomly derived from the F2 generation of each of the six possible crosses among founder inbreds. One third-cycle inbred was randomly derived from the F2 generation of each of the 15 possible crosses among second-cycle inbreds. Two fourth-cycle inbreds were randomly derived from each of the F2 generations obtained by chain crossing the 15 third-cycle inbreds, i.e., 1 x 2, 2 x 3, 3 x 4, ..., and 15 x 1.
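The pedigree described above can be sketched in a few lines. This is a minimal illustration, not the author's Fortran program: the inbred labels (`F1`, `C2_1`, ...) and the function names are hypothetical, and the coancestry recursion assumes fully inbred lines derived without selection, so that an inbred has a coancestry of 1 with itself and the average of its parents' coancestries with any earlier inbred.

```python
from itertools import combinations


def build_pedigree():
    """Map each inbred to its two parents (None for founders); parents come first."""
    ped = {f"F{n}": None for n in range(1, 5)}  # 4 unrelated founders
    founders = list(ped)
    cycle2 = []
    for n, parents in enumerate(combinations(founders, 2), start=1):
        ped[f"C2_{n}"] = parents  # 6 possible crosses among founders
        cycle2.append(f"C2_{n}")
    cycle3 = []
    for n, parents in enumerate(combinations(cycle2, 2), start=1):
        ped[f"C3_{n}"] = parents  # 15 possible crosses among second-cycle inbreds
        cycle3.append(f"C3_{n}")
    for n in range(15):  # chain crosses 1 x 2, 2 x 3, ..., 15 x 1
        parents = (cycle3[n], cycle3[(n + 1) % 15])
        for sib in (1, 2):  # two fourth-cycle inbreds per chain cross
            ped[f"C4_{n + 1}_{sib}"] = parents
    return ped


def coancestry(ped):
    """Pedigree-based coancestry among fully inbred lines (assumes no selection)."""
    ids = list(ped)  # insertion order guarantees parents precede progeny
    f = {}
    for pos, i in enumerate(ids):
        f[i, i] = 1.0  # a fully inbred line is identical by descent with itself
        for j in ids[:pos]:
            if ped[i] is None:
                value = 0.0  # founders are unrelated
            else:
                a, b = ped[i]
                value = 0.5 * (f[a, j] + f[b, j])
            f[i, j] = f[j, i] = value
    return f


ped = build_pedigree()
f = coancestry(ped)
print(len(ped), f["C2_1", "F1"], f["C4_1_1", "C4_1_2"])
```

The same structure would be built once for each heterotic group; coancestries between groups are zero because the founders are unrelated between groups.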

The n = 10, 50, or 100 QTL were unlinked with each other. The kth (k = 1, 2, ..., n) QTL, denoted $Q_k$, was linked to two flanking marker loci ($M_k$ and $N_k$). The $Q_k$ locus was located exactly midway between $M_k$ and $N_k$. The map distance between $Q_k$ and either flanking marker was constant across all n QTL, and was equal to 2.5 or 10 cM.

There were four alleles (+, +', -, and -') at each $Q_k$ locus. At any given locus, alleles + and - were found only in one heterotic group, whereas +' and -' were found only in the other heterotic group. Group 1 had the + and - alleles at odd-numbered QTL, and the +' and -' alleles at even-numbered QTL. In contrast, Group 2 had the + and - alleles at even-numbered QTL, and the +' and -' alleles at odd-numbered QTL. The founder inbreds in each heterotic group were randomly derived from a conceptual base population wherein the two alleles at $Q_k$ had frequencies of 1/2 across all n QTL. Each founder inbred differed in its marker allele at both $M_k$ and $N_k$.

I simulated recombination between $M_k$ and $Q_k$ and between $Q_k$ and $N_k$ (i.e., for k = 1, 2, ..., n) during the development of second-, third-, and fourth-cycle inbreds. The recombinant inbreds were derived at random according to the expected frequencies of parental types, single crossovers, and double crossovers among $M_k$, $Q_k$, and $N_k$ (Table 1). With a single meiosis, the recombination frequency was r between $M_k$ and $Q_k$, as well as between $Q_k$ and $N_k$. The map distances were transformed into r with Haldane's mapping function (Haldane, 1919). Among recombinant inbreds, the expected frequency of recombinants was $R_1 = 2r/(1 + 2r)$ between $M_k$ and $Q_k$, and $R_2 = R_1$ between $Q_k$ and $N_k$ (Haldane and Waddington, 1931). Interference was absent. The frequency of recombinants between $M_k$ and $N_k$ was $R = R_1 + R_2 - 2R_1R_2$. Values of $R$, $R_1$, and $R_2$ were constant across all n QTL.
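The conversion from map distance to recombinant frequencies can be illustrated as follows. This is a small sketch with hypothetical function names; it uses the standard form of Haldane's mapping function, r = (1 - e^(-2d))/2 with d in Morgans, together with the two formulas quoted in the text.

```python
import math


def haldane_r(distance_cm):
    """Single-meiosis recombination frequency from map distance (Haldane, no interference)."""
    d = distance_cm / 100.0  # centiMorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def ril_recombinant_freq(r):
    """Expected recombinant frequency among selfed recombinant inbreds: 2r / (1 + 2r)."""
    return 2.0 * r / (1.0 + 2.0 * r)


def flanking_recombinant_freq(r1, r2):
    """Recombinant frequency between the two flanking markers: R1 + R2 - 2 R1 R2."""
    return r1 + r2 - 2.0 * r1 * r2


for distance in (2.5, 10.0):
    r = haldane_r(distance)
    big_r1 = ril_recombinant_freq(r)
    big_r = flanking_recombinant_freq(big_r1, big_r1)
    print(f"{distance} cM: r = {r:.4f}, R1 = R2 = {big_r1:.4f}, R = {big_r:.4f}")
```

Because recombinant inbreds accumulate crossovers over several generations of selfing, $R_1$ is nearly twice r at these short distances.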


Table 1 Expected genotypic frequencies among recombinant inbreds at a quantitative trait locus ($Q_k$) flanked by two marker loci ($M_k$ and $N_k$).†

 
Genotypic and Phenotypic Values of Single Crosses
The marker loci per se had no effects on the trait, whereas the effects of the QTL were exponential. The genotypic values of the homozygotes at the kth QTL were: $0.98^k$ for $(+/+)_k$, $\tfrac{1}{2}(0.98^k)$ for $(+'/+')_k$, $-\tfrac{1}{2}(0.98^k)$ for $(-/-)_k$, and $-(0.98^k)$ for $(-'/-')_k$. Complete dominance of the allele from the homozygote with the larger value was assumed. In a Group 1 x Group 2 single cross, the genotypic values of the four possible genotypes at $Q_k$ were: $0.98^k$ for $(+/+')_k$ and $(+/-')_k$, $\tfrac{1}{2}(0.98^k)$ for $(-/+')_k$, and $-\tfrac{1}{2}(0.98^k)$ for $(-/-')_k$. Epistasis was absent. The genetic value of each single cross was equal to the sum of the genotypic effects across the n QTL. At odd-numbered QTL, the testcross additive variance in Group 1 (i.e., when crossed to Group 2) was $V^k_{A(1)} = 0.25(0.98^{2k})$. The testcross additive variance in Group 2 (i.e., when crossed to Group 1) was $V^k_{A(2)} = 0.0625(0.98^{2k})$. At even-numbered QTL, $V^k_{A(1)} = 0.0625(0.98^{2k})$ and $V^k_{A(2)} = 0.25(0.98^{2k})$. The dominance variance at both odd- and even-numbered QTL was $V^k_D = 0.0625(0.98^{2k})$. Summing across n QTL, the genetic variances were: $V_{A(1)} = \sum_k V^k_{A(1)}$; $V_{A(2)} = \sum_k V^k_{A(2)}$; and $V_D = \sum_k V^k_D$. Total genetic variance was $V_G = V_{A(1)} + V_{A(2)} + V_D$. The resulting ratio of $V_D/V_G$ was 0.167, which was close to the average ratio of 0.20 for maize grain yield when the number of single crosses was >250 (Bernardo, 1996a).
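The stated ratio of 0.167 can be checked directly from the genotypic values given above. The sketch below is an illustration under two assumptions that go beyond the wording of the text: allele frequencies of 1/2 in both groups (as in the conceptual base population), and testcross additive variance taken as the variance among allele means within a group, with dominance variance as the remainder. Function names are hypothetical.

```python
def locus_variances(a=1.0):
    """Variance components at one odd-numbered QTL whose largest genotypic value is a."""
    # Single-cross genotypic values, keyed by (Group 1 allele, Group 2 allele).
    g = {("+", "+'"): a, ("+", "-'"): a, ("-", "+'"): a / 2, ("-", "-'"): -a / 2}
    mean = sum(g.values()) / 4
    # Mean testcross value of each allele, averaged over the other group's alleles.
    m1 = {x: (g[x, "+'"] + g[x, "-'"]) / 2 for x in ("+", "-")}
    m2 = {y: (g["+", y] + g["-", y]) / 2 for y in ("+'", "-'")}
    va1 = sum((m - mean) ** 2 for m in m1.values()) / 2
    va2 = sum((m - mean) ** 2 for m in m2.values()) / 2
    vg = sum((v - mean) ** 2 for v in g.values()) / 4
    return va1, va2, vg - va1 - va2, vg


def genetic_variances(n):
    """Sum per-locus components over n QTL with effects 0.98**k (roles swap at even k)."""
    va1 = va2 = vd = 0.0
    for k in range(1, n + 1):
        big, small, dom, _ = locus_variances(0.98 ** k)
        if k % 2 == 1:
            va1, va2 = va1 + big, va2 + small
        else:
            va1, va2 = va1 + small, va2 + big
        vd += dom
    return va1, va2, vd, va1 + va2 + vd


for n in (10, 50, 100):
    va1, va2, vd, vg = genetic_variances(n)
    print(n, round(vd / vg, 3))
```

Under these assumptions every locus contributes additive and dominance variance in the same 5:1 proportion, so the ratio of dominance to total genetic variance is 1/6 regardless of n.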

The phenotypic value of a single cross was equal to its genetic value plus a random nongenetic effect. Nongenetic effects were normally and independently distributed with a mean of zero and variance $V_E$. The value of $V_E$ was calculated based on $h^2 = V_G/(V_G + V_E)$ equal to 0.2, 0.4, 0.6, 0.8, or 1.
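Solving the heritability definition for the nongenetic variance gives $V_E = V_G(1 - h^2)/h^2$. A minimal sketch of this step, with hypothetical function names and Python's standard random number generator standing in for whatever generator the original program used:

```python
import random


def nongenetic_variance(vg, h2):
    """Solve h2 = VG / (VG + VE) for VE; h2 = 1 gives VE = 0."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    return vg * (1.0 - h2) / h2


def add_nongenetic_effects(genetic_values, vg, h2, rng):
    """Phenotype = genetic value + independent normal deviate with variance VE."""
    sd_e = nongenetic_variance(vg, h2) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genetic_values]


rng = random.Random(1999)  # arbitrary seed, for reproducibility only
print(add_nongenetic_effects([1.2, -0.4, 0.7], vg=0.375, h2=0.4, rng=rng))
```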

Tested and Untested Single Crosses
There were 3025 single crosses between the 55 inbreds in Group 1 and 55 inbreds in Group 2. A total of 500 single crosses were randomly chosen as tested hybrids, whereas the remaining 2525 single crosses were considered untested (but had known genetic values). The performance of the 2525 untested single crosses was predicted from the phenotypic values of the 500 tested single crosses by T-BLUP and TM-BLUP.
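The partition into tested and untested hybrids amounts to drawing 500 of the 55 x 55 parental combinations without replacement. A short sketch, in which inbreds are identified by hypothetical integer indices:

```python
import random


def partition_single_crosses(n_group1=55, n_group2=55, n_tested=500, seed=1):
    """Randomly split all Group 1 x Group 2 single crosses into tested and untested sets."""
    rng = random.Random(seed)
    crosses = [(i, j) for i in range(n_group1) for j in range(n_group2)]
    tested = set(rng.sample(crosses, n_tested))  # sampling without replacement
    untested = [c for c in crosses if c not in tested]
    return sorted(tested), untested


tested, untested = partition_single_crosses()
print(len(tested), len(untested))
```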

Covariance between Single Crosses
Assume i and j are inbreds from Group 1, whereas i' and j' are inbreds from Group 2. I assumed different proportions of QTL (p = 0.1, 0.3, 0.6, or 1) had known linkage to markers for the purpose of comparing T-BLUP and TM-BLUP. Different combinations of n and p resulted in different proportions of VG that were accounted for by the marked QTL (Table 2) .


Table 2 Proportions of the total genetic variance explained by the marked quantitative trait loci (QTL) when different proportions (p) of the n QTL had known linkage to flanking markers.†

 
In T-BLUP, information on marker-QTL linkage was assumed unknown for all QTL (i.e., p = 0). The covariance between the i x i' and j x j' single crosses was (Stuber and Cockerham, 1966):

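$$\mathrm{Cov}(i \times i',\, j \times j') = f_{ij} V_{A(1)} + f_{i'j'} V_{A(2)} + f_{ij} f_{i'j'} V_D \tag{1}$$
where $f_{ij}$ = Malecot's coefficient of coancestry between i and j, and $f_{i'j'}$ = Malecot's coefficient of coancestry between i' and j'.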

In TM-BLUP, information on marker–QTL linkage was known for the first pn QTL, which had the largest effects. The marker–QTL linkages were assumed unknown at the remaining (1 - p)n QTL. The genotypes at $Q_k$ were assumed unknown and were inferred from the genotypes at $M_k$ and $N_k$. Definition of alleles according to their origin becomes necessary. The alleles homozygous in a given inbred are denoted in superscript, i.e., inbred i is homozygous for the alleles $M^i_k$, $Q^i_k$, and $N^i_k$. The conditional covariance between single crosses, given the observed marker genotypes ($G_k^{obs}$) at $M_k$ and $N_k$, was (Bernardo, 1998):

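
$$\mathrm{Cov}(i \times i',\, j \times j' \mid G_k^{obs}) = \sum_{k=1}^{pn} \left[ \Pr(Q^i_k \equiv Q^j_k)\, V^k_{A(1)} + \Pr(Q^{i'}_k \equiv Q^{j'}_k)\, V^k_{A(2)} + \Pr(Q^i_k \equiv Q^j_k) \Pr(Q^{i'}_k \equiv Q^{j'}_k)\, V^k_D \right] + \sum_{k=pn+1}^{n} \left[ f_{ij} V^k_{A(1)} + f_{i'j'} V^k_{A(2)} + f_{ij} f_{i'j'} V^k_D \right] \tag{2}$$
where $\Pr(Q^i_k \equiv Q^j_k)$ = probability that $Q^i_k$ and $Q^j_k$ are identical by descent given $G_k^{obs}$, and $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$ = probability that $Q^{i'}_k$ and $Q^{j'}_k$ are identical by descent given $G_k^{obs}$. The calculation of $\Pr(Q^i_k \equiv Q^j_k)$ and $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$ is outlined below.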
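The two covariance models can be compared in code. The sketch below is an interpretation, not the published formula verbatim: it assumes that marked QTL contribute through their locus-specific identity-by-descent probabilities and per-locus variances, while unmarked QTL fall back on pedigree coancestry. All function and argument names are hypothetical.

```python
def cov_tblup(f1, f2, va1, va2, vd):
    """Pedigree-based covariance between single crosses i x i' and j x j'."""
    return f1 * va1 + f2 * va2 + f1 * f2 * vd


def cov_tmblup(pr1, pr2, va1_k, va2_k, vd_k, f1, f2):
    """Marker-based covariance (assumed form).

    pr1, pr2: IBD probabilities at the marked QTL (the first len(pr1) loci)
    va1_k, va2_k, vd_k: per-locus variances for all n QTL
    f1, f2: pedigree coancestries, used for the unmarked QTL
    """
    m = len(pr1)
    # zip stops at the shortest sequence, so only the first m loci enter here.
    marked = sum(
        p1 * a1 + p2 * a2 + p1 * p2 * d
        for p1, p2, a1, a2, d in zip(pr1, pr2, va1_k, va2_k, vd_k)
    )
    unmarked = cov_tblup(f1, f2, sum(va1_k[m:]), sum(va2_k[m:]), sum(vd_k[m:]))
    return marked + unmarked


# Hypothetical example: 4 QTL, the first 2 marked.
va1_k, va2_k, vd_k = [0.24, 0.06, 0.22, 0.05], [0.06, 0.23, 0.06, 0.21], [0.06] * 4
print(cov_tmblup([1.0, 0.0], [0.5, 0.5], va1_k, va2_k, vd_k, f1=0.5, f2=0.5))
print(cov_tblup(0.5, 0.5, sum(va1_k), sum(va2_k), sum(vd_k)))
```

When every marker-based probability equals the corresponding coancestry, the two functions return the same value, which is the expectation argument developed in the Results and discussion.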

Probability of Descent of a Marked QTL Allele
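Suppose a and b are the parental inbreds of i, and j is not a direct descendant of i. The $\Pr(Q^i_k \equiv Q^j_k)$ term can be expressed in terms of the conditional probability that $Q^j_k \equiv Q^a_k$ and $Q^j_k \equiv Q^b_k$ given $G_k^{obs}$ (Bernardo, 1998):

$$\Pr(Q^i_k \equiv Q^j_k \mid G_k^{obs}) = \Pr(Q^i_k \leftarrow Q^a_k \mid G_k^{obs}) \Pr(Q^j_k \equiv Q^a_k \mid G_k^{obs}) + \Pr(Q^i_k \leftarrow Q^b_k \mid G_k^{obs}) \Pr(Q^j_k \equiv Q^b_k \mid G_k^{obs}) \tag{3}$$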


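The conditional probability that $Q^i_k \leftarrow Q^a_k$ (i.e., that i inherited its allele at $Q_k$ from a), given $G_k^{obs}$, was obtained from the expected frequencies of parental types, single crossovers, and double crossovers among $M_k$, $Q_k$, and $N_k$ (Table 1):

$$\Pr(Q^i_k \leftarrow Q^a_k \mid G_k^{obs}) = \Pr(M^i_k \leftarrow M^a_k) \Pr(N^i_k \leftarrow N^a_k) \frac{(1 - R_1)(1 - R_2)}{1 - R} + \Pr(M^i_k \leftarrow M^a_k) \Pr(N^i_k \leftarrow N^b_k) \frac{(1 - R_1) R_2}{R} + \Pr(M^i_k \leftarrow M^b_k) \Pr(N^i_k \leftarrow N^a_k) \frac{R_1 (1 - R_2)}{R} + \Pr(M^i_k \leftarrow M^b_k) \Pr(N^i_k \leftarrow N^b_k) \frac{R_1 R_2}{1 - R} \tag{4}$$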




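The values of $\Pr(M^i_k \leftarrow M^a_k)$, $\Pr(M^i_k \leftarrow M^b_k)$, $\Pr(N^i_k \leftarrow N^a_k)$, and $\Pr(N^i_k \leftarrow N^b_k)$ were determined from the marker genotypes, and were equal to 1 or 0 if $M_k$ or $N_k$ was polymorphic between a and b. Suppose a has the +/+ genotype, b has the -/- genotype, and i has the +/+ genotype at $M_k$. In this example, $\Pr(M^i_k \leftarrow M^a_k) = 1$ and $\Pr(M^i_k \leftarrow M^b_k) = 0$. But if a and b both have the +/+ genotype, the values of $\Pr(M^i_k \leftarrow M^a_k)$ and $\Pr(M^i_k \leftarrow M^b_k)$ are simply determined from the parental contributions to inbred progeny, i.e., the proportion of the genome derived directly by an inbred from each of its two parents. For example, if i is a BC1-derived inbred, a is the recurrent parent, b is the donor parent, and a and b have the +/+ genotype, then $\Pr(M^i_k \leftarrow M^a_k) = 0.75$ and $\Pr(M^i_k \leftarrow M^b_k) = 0.25$. The inbreds in this study were derived from F2 populations. Thus, $\Pr(M^i_k \leftarrow M^a_k) = \Pr(M^i_k \leftarrow M^b_k) = 0.5$ when $M_k$ was not polymorphic between a and b.

Because $Q^i_k$ must have descended from either $Q^a_k$ or $Q^b_k$, $\Pr(Q^i_k \leftarrow Q^b_k \mid G_k^{obs})$ was calculated as $1 - \Pr(Q^i_k \leftarrow Q^a_k \mid G_k^{obs})$.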
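The descent probabilities described in this section can be sketched as follows. This is a reconstruction from the verbal description, not the published equations: it assumes that the origin of the QTL allele is inferred from the origins of the two flanking marker alleles using three-locus recombinant-inbred frequencies without interference, and that the marker-origin probabilities at $M_k$ and $N_k$ are combined as independent weights. Function names are hypothetical.

```python
def pr_qtl_from_parent_a(pm_a, pn_a, r1, r2):
    """Pr(inbred i carries parent a's QTL allele | flanking marker origins).

    pm_a, pn_a: probabilities that i's alleles at the left and right markers
    came from parent a (1 or 0 when the marker is polymorphic between the
    parents, 0.5 otherwise for F2-derived inbreds).
    r1, r2: recombinant frequencies marker-QTL among recombinant inbreds.
    """
    r = r1 + r2 - 2.0 * r1 * r2  # recombinant frequency between the markers
    pm_b, pn_b = 1.0 - pm_a, 1.0 - pn_a
    return (
        pm_a * pn_a * (1.0 - r1) * (1.0 - r2) / (1.0 - r)  # both markers from a
        + pm_a * pn_b * (1.0 - r1) * r2 / r                # left from a, right from b
        + pm_b * pn_a * r1 * (1.0 - r2) / r                # left from b, right from a
        + pm_b * pn_b * r1 * r2 / (1.0 - r)                # both markers from b
    )


def pr_ibd_with_j(pr_from_a, pr_j_ibd_a, pr_j_ibd_b):
    """Pr(i and j are IBD at the QTL), given i's parents a and b (j not a descendant of i)."""
    return pr_from_a * pr_j_ibd_a + (1.0 - pr_from_a) * pr_j_ibd_b


R1 = 0.0465  # approximate recombinant-inbred value for a 2.5 cM interval
print(pr_qtl_from_parent_a(1.0, 1.0, R1, R1))  # both flanking markers trace to a
print(pr_qtl_from_parent_a(1.0, 0.0, R1, R1))  # markers disagree, QTL midway
print(pr_qtl_from_parent_a(0.5, 0.5, R1, R1))  # markers uninformative
```

With both markers tracing to the same parent and tight linkage, the QTL allele is assigned to that parent almost with certainty; with uninformative markers the probability falls back to the pedigree expectation of 0.5.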

Prediction of Single-Cross Performance
Let $\mathbf{y}_T$ be a 500 by 1 vector of phenotypic values of the 500 tested single crosses. The performance of the 2525 untested single crosses was predicted as

$$\mathbf{y}_U = \mathbf{C}_{UT}\, \mathbf{C}_{TT}^{-1}\, \mathbf{y}_T \tag{5}$$
where $\mathbf{y}_U$ = 2525 by 1 vector of predicted performance of the untested single crosses, $\mathbf{C}_{UT}$ = 2525 by 500 matrix of genetic covariances between the untested single crosses and the tested single crosses, and $\mathbf{C}_{TT}$ = 500 by 500 phenotypic variance–covariance matrix among the tested single crosses. The elements of $\mathbf{C}_{UT}$ as well as the off-diagonal elements of $\mathbf{C}_{TT}$ were calculated using Eq. [1] in T-BLUP and Eq. [2] in TM-BLUP. The diagonal elements of $\mathbf{C}_{TT}$ were equal to $V_G + V_E$ in both procedures.
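The prediction step is a single linear solve. The sketch below assumes the predictor has the form (covariance of untested with tested) x (inverse of tested covariance) x (tested phenotypes), uses a small hand-written Gaussian elimination so that it runs with the standard library alone, and leaves any centering of the phenotypes to the caller. Names and the toy numbers are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def blup_predict(c_ut, c_tt, y_t):
    """Predict untested performance as C_UT times the solution of C_TT w = y_T."""
    w = solve(c_tt, y_t)
    return [sum(c * wi for c, wi in zip(row, w)) for row in c_ut]


# Toy example: two tested hybrids, one untested hybrid related to both.
c_tt = [[1.0, 0.3], [0.3, 1.0]]  # diagonal = VG + VE, off-diagonal = genetic covariance
c_ut = [[0.4, 0.2]]
print(blup_predict(c_ut, c_tt, [1.5, -0.5]))
```

Solving the system rather than forming the inverse explicitly gives the same predictions and is the usual numerical choice; for the 500 by 500 case a linear algebra library would be preferable to this hand-written routine.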

Simple correlations between the predicted performance and true genetic values of the 2525 untested single crosses were calculated. The mean correlation across the 50 repeats was calculated. Pairwise two-sided significance tests (P = 0.05) among the mean correlations were done by standard bootstrapping (Efron, 1979). The significance tests were based on 5000 bootstrap samples, each sample comprising 50 paired correlations randomly selected with replacement from the 50 repeats.
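The evaluation step can be sketched as a simple correlation followed by a paired bootstrap. The text specifies only "standard bootstrapping" with 5000 samples of 50 paired correlations; the percentile interval on the mean paired difference used below is an assumption about how the two-sided test was carried out, and the function names are hypothetical.

```python
import random


def pearson(x, y):
    """Simple (Pearson) correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_paired_test(corr_a, corr_b, n_boot=5000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for the mean paired difference (assumed test form)."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(corr_a, corr_b)]
    n = len(diffs)
    means = sorted(
        sum(rng.choice(diffs) for _ in range(n)) / n for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    significant = not (lower <= 0.0 <= upper)
    return lower, upper, significant


# Hypothetical correlations from 50 repeats of two procedures.
gen = random.Random(7)
tm = [0.80 + gen.uniform(-0.02, 0.02) for _ in range(50)]
t = [0.78 + gen.uniform(-0.02, 0.02) for _ in range(50)]
print(bootstrap_paired_test(tm, t))
```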


    Results and discussion
The T-BLUP procedure was effective for predicting single-cross performance. The correlation between predicted and true performance of untested single crosses ranged from 0.60, with $h^2$ = 0.2 and n = 100, to 0.96, with $h^2$ = 1 and n = 10, 50, or 100 (Table 3). The T-BLUP correlations increased as $h^2$ increased. Estimates of $h^2$ for maize grain yield, on an entry-mean basis, have ranged from about 0.4 to 0.6 (Bernardo, 1996a). With $h^2$ equal to 0.4 or 0.6, the T-BLUP correlations ranged from 0.74 to 0.84 across different values of n. Previous research demonstrating the usefulness of T-BLUP was based on an empirical cross-validation procedure wherein the true genetic values of single crosses were unknown (Bernardo, 1996a). In this study, the high correlations between predicted and true genetic values, albeit from computer simulation, are further evidence of the effectiveness of T-BLUP for identifying superior single crosses prior to field testing.


Table 3 Mean correlations, across 50 repeats, between predicted performance and true genetic values of single crosses for T-BLUP and TM-BLUP† with different numbers of quantitative trait loci (n), trait heritability ($h^2$), and proportions of quantitative trait loci with known linkage to flanking markers (p)

 
The usefulness of TM-BLUP for predicting single-cross performance increased as the proportion of QTL with known linkage to flanking markers approached p = 1, and as the map distance between a QTL and its flanking markers decreased from 10 to 2.5 cM (Table 3). The advantage of TM-BLUP over T-BLUP decreased as $h^2$ and the number of QTL controlling the trait increased. With n = 10 QTL and $h^2$ = 0.2, the correlations obtained with TM-BLUP exceeded those obtained with T-BLUP by 0.02 to 0.09. But with n = 10 and $h^2$ = 0.4 or 0.6, the TM-BLUP correlations exceeded the T-BLUP correlations by only 0.00 to 0.06. It is unlikely that a complex trait such as maize grain yield would be controlled by only n = 10 QTL. With n = 50 or 100, the TM-BLUP correlations exceeded those obtained by T-BLUP by only 0.00 to 0.03. These marginal increases in the correlation, for moderate $h^2$ or a large number of QTL controlling the trait, were too small to be of any practical significance. They indicated that using markers to infer the identity by descent of QTL alleles does not appreciably improve the prediction of single-cross performance.

The reason for this lack of improvement can be deduced from the covariance between single crosses in T-BLUP (Eq. [1]) and in TM-BLUP (Eq. [2]) under simplifying assumptions. Assume p = 1 and that the probability that QTL alleles are identical by descent [i.e., $\Pr(Q^i_k \equiv Q^j_k)$ and $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$] is known at each locus. The covariance between single crosses in TM-BLUP becomes $\sum_{k=1}^{n} [\Pr(Q^i_k \equiv Q^j_k) V^k_{A(1)} + \Pr(Q^{i'}_k \equiv Q^{j'}_k) V^k_{A(2)} + \Pr(Q^i_k \equiv Q^j_k) \Pr(Q^{i'}_k \equiv Q^{j'}_k) V^k_D]$. For recombinant inbreds derived at random, the expectation of $\frac{1}{n} \sum_k \Pr(Q^i_k \equiv Q^j_k)$ is equal to $f_{ij}$. Likewise, the expectation of $\frac{1}{n} \sum_k \Pr(Q^{i'}_k \equiv Q^{j'}_k)$ is equal to $f_{i'j'}$. The expectations of the covariance between $\Pr(Q^i_k \equiv Q^j_k)$ and $V^k_{A(1)}$, between $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_{A(2)}$, and between $\Pr(Q^i_k \equiv Q^j_k) \Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_D$ are equal to zero for recombinant inbreds derived at random. In other words, whether or not two inbreds have alleles at a particular QTL that are identical by descent does not depend on the magnitude of the testcross additive or dominance variance at that QTL. Consequently, the covariance between single crosses in TM-BLUP becomes equivalent to $f_{ij} \sum_k V^k_{A(1)} + f_{i'j'} \sum_k V^k_{A(2)} + f_{ij} f_{i'j'} \sum_k V^k_D$. This expression is equal to $f_{ij} V_{A(1)} + f_{i'j'} V_{A(2)} + f_{ij} f_{i'j'} V_D$, i.e., the covariance between single crosses in T-BLUP. Therefore, the lack of improvement in the predictions with marker data was due to the identical expectations of the covariance between single crosses in T-BLUP and TM-BLUP in the absence of selection.

When a trait is controlled by few QTL, the observed covariances between $\Pr(Q^i_k \equiv Q^j_k)$ and $V^k_{A(1)}$, between $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_{A(2)}$, and between $\Pr(Q^i_k \equiv Q^j_k) \Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_D$ may deviate substantially from zero because of sampling variation. This phenomenon accounts for the greater advantage of TM-BLUP over T-BLUP when the trait is controlled by fewer loci. The loss of advantage of TM-BLUP over T-BLUP could apply not only to single crosses, but also to predicting breeding values of individuals within a population, as proposed by van Arendonk et al. (1994). Markers may be most useful for dissecting complex quantitative traits that are controlled by many, rather than few, QTL. It is ironic that TM-BLUP is most useful when few QTL control the trait or when genetic gain is sought only at a limited subset of QTL. The latter situation may be true for germplasm introgression programs, wherein inbreds are derived from backcross populations with an adapted inbred as the recurrent parent and an exotic population as the donor parent.

Perhaps TM-BLUP might also be more effective when recombinant inbreds are derived through selection rather than at random. Selection during inbreeding may cause deviations between the average identity by descent across QTL [i.e., $\frac{1}{n} \sum_k \Pr(Q^i_k \equiv Q^j_k)$ and $\frac{1}{n} \sum_k \Pr(Q^{i'}_k \equiv Q^{j'}_k)$] and Malecot's coefficients of coancestry based on pedigree ($f_{ij}$ and $f_{i'j'}$). Selection may also cause nonzero covariances between $\Pr(Q^i_k \equiv Q^j_k)$ and $V^k_{A(1)}$, between $\Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_{A(2)}$, and between $\Pr(Q^i_k \equiv Q^j_k) \Pr(Q^{i'}_k \equiv Q^{j'}_k)$ and $V^k_D$. Consequently, the covariance between single crosses would differ between T-BLUP and TM-BLUP. On the other hand, if n is large, the effect of selection at individual QTL may be too small to cause a difference in the covariance between single crosses with T-BLUP and TM-BLUP.

In this study, I assumed that (i) the genetic variances associated with QTL were known, (ii) the QTL were unlinked with each other, and (iii) epistasis was absent. In practice, errors in the estimation of $V^k_{A(1)}$, $V^k_{A(2)}$, and $V^k_D$ can only lead to further decreases in the effectiveness of TM-BLUP relative to T-BLUP. The assumption of unlinked QTL is unrealistic when many QTL exist in a finite genome. The effects of assuming unlinked QTL in TM-BLUP, when linkages among QTL exist, are unclear. If the QTL are linked with each other, Eq. [2] should be expanded to account for the covariance of effects at linked QTL, or might be used as an approximation. The effects of epistasis were assumed negligible for two reasons. First, epistatic variances have smaller contributions than $V_{A(1)}$, $V_{A(2)}$, and $V_D$ in the covariance between single crosses (Stuber and Cockerham, 1966). Second, epistatic variance is often small even when strong physiological epistasis is present. For example, epistatic variance comprises only 14% of $V_G$ with complementary gene action, i.e., 9:7 ratio in the F2 of a dihybrid cross.
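The 14% figure for complementary gene action can be reproduced with a short variance partition. The sketch assumes an F2 population (allele frequencies of 1/2, two unlinked loci), a genotypic value of 1 when both loci carry at least one dominant allele and 0 otherwise, and the usual least-squares partition in which epistatic variance is what remains after the single-locus (additive plus dominance) variances are removed. The function name is hypothetical.

```python
from itertools import product


def complementary_epistasis_fraction():
    """Fraction of genetic variance that is epistatic for a 9:7 F2 dihybrid."""
    freq = {0: 0.25, 1: 0.5, 2: 0.25}  # copies of the dominant allele at a locus

    def value(x1, x2):
        return 1.0 if x1 > 0 and x2 > 0 else 0.0  # complementary gene action

    mean = sum(freq[a] * freq[b] * value(a, b) for a, b in product(freq, freq))
    vg = sum(
        freq[a] * freq[b] * (value(a, b) - mean) ** 2 for a, b in product(freq, freq)
    )
    # Marginal genotypic means at one locus (both loci are equivalent by symmetry).
    marginal = {a: sum(freq[b] * value(a, b) for b in freq) for a in freq}
    v_locus = sum(freq[a] * (marginal[a] - mean) ** 2 for a in freq)
    # Additive part: regression of the marginal mean on allele count (variance 0.5).
    cov = sum(freq[a] * (a - 1.0) * (marginal[a] - mean) for a in freq)
    va_locus = cov ** 2 / 0.5
    vd_locus = v_locus - va_locus
    v_epistatic = vg - 2.0 * (va_locus + vd_locus)
    return v_epistatic / vg


print(round(complementary_epistasis_fraction(), 3))
```

Under these assumptions the epistatic share works out to 1/7 of the total genetic variance, in line with the value quoted in the text.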

The main conclusions from this study are that (i) T-BLUP is effective in predicting single-cross performance even with moderate $h^2$ (i.e., 0.4–0.6), and (ii) molecular markers that flank QTL do not greatly improve the predictions, even when all QTL are tightly linked to flanking markers. Any advantage of TM-BLUP over T-BLUP decreases as the number of QTL increases. These results cast doubt on the usefulness of TM-BLUP in applied breeding programs, not only for predicting single-cross performance, but also for selecting individuals within plant or animal populations.


    ACKNOWLEDGMENTS
 
I thank Wyman E. Nyquist for helpful comments regarding the manuscript.


    NOTES
Research supported in part by Purdue Agric. Research Programs, Hatch Project IND050035. Journal Paper no. 15762.

Received for publication October 16, 1998.

