Inst. of Plant Breeding, Seed Sci., and Population Genetics, Univ. of Hohenheim, 70593 Stuttgart, Germany
* Corresponding author (melchinger{at}uni-hohenheim.de).
| ABSTRACT |
|---|
Abbreviations: CIM, composite interval mapping; cM, centiMorgan; CV, cross validation; DS, data set; ES, estimation set; IV, independent validation; LOD, log odds ratio; LR, likelihood ratio; MAS, marker-assisted selection; p, proportion of the genetic variance; P1, parent one; P2, parent two; QTL, quantitative trait locus/loci; RFLP, restriction fragment length polymorphism; TC, testcross; TS, test set.
| INTRODUCTION |
|---|
In contrast, congruency of QTL between different populations seems to be rather common for crosses of highly divergent parent lines and complex but easily classified morphological traits. In interspecific crosses, QTL with mostly drastic effects mapped to the same genomic sites or even syntenic regions (for review see Beavis, 1998). Likewise, Mackay (1995, 1996) and Long et al. (1995) reported a clustering of QTL from different populations in the vicinity of candidate loci for the highly heritable trait bristle number in Drosophila.
Important factors influencing QTL congruency are the sample size employed in QTL mapping as well as the approach used for comparing the QTL detected. With mostly limited sample sizes of mapping populations, the error in estimates of QTL number, positions, and effects is generally high, especially for polygenic traits (Otto and Jones, 2000; Beavis, 1998; Broman, 2001; Utz and Melchinger, 1994). Therefore, criteria for assessing QTL congruency should allow discrimination between incongruency caused by biological or biometrical reasons.
Three criteria have been proposed in the literature for investigating the congruency of QTL: (i) counting of QTL at congruent genomic sites across the genome as used in numerous studies; (ii) permutation test of correspondence between genome-wide generated log odds ratio (LOD) score profiles described by Keightley and Knott (1999); (iii) genetic correlation between predicted and observed phenotypic values in an independent sample having special appeal with regard to MAS (Lande and Thompson, 1990; Melchinger et al., 1998; Utz et al., 2000). Determining congruency implies comparisons of at least two samples by use of either an additional independent validation (IV) sample or CV. We applied all three criteria and both validation methods to compare QTL results for traits of presumably different complexity from five populations with both, one, or none of the three elite parents in common.
Our objectives were to (i) determine the positions and gene effects of QTL detected in each of the five populations, (ii) compare QTL congruency across populations by all three criteria, (iii) discuss the influence of the sample and genetic background on QTL congruency for different traits, and (iv) draw conclusions regarding the prospects of MAS in plant breeding.
| MATERIALS AND METHODS |
|---|
Field Experiments
The TC progenies were evaluated in five experiments. Experiment 1 (A x BI) was conducted in 1990 and 1991 at two locations in Germany (Gondelsheim and Grucking) as described by Melchinger et al. (1998). The 400 entries consisted of 380 TCs of F3 lines, TCs of parents A and B included as quintuple entries, and 10 common check hybrids. In addition, data on plant height were taken from forage trials conducted at five environments in Germany as described by Lübberstedt et al. (1997). Experiment 2 (A x BII) was conducted in 1992 and 1993 at two locations in Germany (Eckartsweier and Bad Krozingen). The 150 entries consisted of TCs of the 127 F3 lines, TCs of the parents A and B included as six and seven entries, respectively, and the same set of 10 check hybrids as in Exp. 1. Because of insufficient quantities of seeds, TC progenies of only 71 F5 lines of cross A x BIII were evaluated in Exp. 3, 109 F4 lines (A x C) in Exp. 4, and 84 F4 lines (C x D) in Exp. 5, conducted in 1992 in adjacent trials at five locations with rather diverse agroecological conditions (Chartres in France; Eckartsweier, Grucking, Bad Krozingen, and Gondelsheim in Germany). Experiments 3 to 5 each included 150 entries. Testcrosses of each parent line were included as quintuple entries in each experiment as well as common check hybrids and other lines for completion. The experimental design employed was a 40 x 10 α-design (Patterson and Williams, 1976) for Exp. 1 and a 15 x 10 α-design for the remaining experiments, with two replications each. Two-row plots were overplanted and later thinned to reach a final stand of 80 000 to 110 000 plants ha⁻¹ depending on the location. All experiments were machine planted and harvested as grain trials with a combine.
Data were analyzed for the following traits: grain yield (Mg ha⁻¹) adjusted to 155 g kg⁻¹ grain moisture, grain moisture (g kg⁻¹) at harvest, kernel weight (mg per kernel) determined from four samples of 50 kernels from each plot, protein concentration in grain (g kg⁻¹) estimated by near-infrared reflectance spectroscopy as described by Melchinger et al. (1986), and plant height (cm) measured on a plot basis as the distance from the soil level to the lowest tassel branch.
RFLP Marker Genotyping and Linkage Map Construction
The procedures for RFLP assays were described by Schön et al. (1994). A subsample of 344 parental F2 plants of the 380 F3 lines of A x BI, and a subsample of 109 parental F2 plants of the 127 F3 lines of A x BII were genotyped for a total of 89 RFLP marker loci distributed across the maize genome. A total of 151, 104, and 122 RFLP marker loci were employed to map 113 F5 lines of A x BIII, as well as 131 and 140 F4 lines of crosses A x C and C x D, respectively. Observed genotype frequencies at each marker locus were tested against expected Mendelian segregation ratios and allele frequency 0.5 by χ² tests. Appropriate type I error rates were determined by the sequentially rejective Bonferroni procedure (Holm, 1979). Linkage maps of the individual populations, as well as a joint map combining the molecular data of all populations, were constructed with software JOINMAP Version 3.0 (Van Ooijen and Voorrips, 2001). A LOD threshold of 3.0 was used for declaring linkage in two-point analyses and Haldane's mapping function (Haldane, 1919) was employed for calculating map distances. For the joint map, each linkage group was truncated at both ends. The points of truncation were the most distal markers common to all individual maps.
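The segregation test and the sequentially rejective Bonferroni procedure can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes a codominant marker scored on F2 plants (expected 1:2:1 ratio, 2 degrees of freedom), and the function names `chi2_f2_segregation` and `holm_reject` are hypothetical. Expected ratios for F4 or F5 lines differ and would need to be substituted.

```python
import math


def chi2_f2_segregation(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test against a 1:2:1 ratio (assumed F2, df = 2).

    Returns the test statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    observed = (n_aa, n_ab, n_bb)
    expected = (0.25 * n, 0.50 * n, 0.25 * n)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 2 the chi-square survival function is exactly exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


def holm_reject(p_values, alpha=0.05):
    """Holm (1979) sequentially rejective Bonferroni procedure.

    Returns one boolean per input p-value (True = hypothesis rejected).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are retained as well
    return reject
```

In use, one p-value per marker locus would be collected with `chi2_f2_segregation` and the whole list passed to `holm_reject`, so that the family-wise type I error rate is controlled across loci.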
Agronomic Data Analyses
Analyses of variance were performed for each experiment and environment. Adjusted entry means and effective error mean squares were then used to compute the combined analyses of variance and covariance across environments for each experiment. The sums of squares for entries were subdivided into the variation among TCs of the Fn lines and orthogonal contrasts among the TC means of parent lines P1 and P2 and Fn lines. A corresponding subdivision was conducted on the entry x environment interaction sums of squares. Estimates of variance components σ²e (effective error variance), σ²ge (genotype x environment interaction variance), and σ²g (genotypic variance) of Fn TC progenies and their standard errors were calculated as described by Searle (1971, p. 475). Heritabilities (h²) on a TC progeny mean basis were estimated as described by Hallauer and Miranda (1981, p. 90) and their 95% confidence intervals according to Knapp et al. (1985). Phenotypic (r̂p) and genotypic (r̂g) correlations between the TC performance of F5 lines of A x BIII and F3 lines of A x BII were calculated for all traits by standard procedures (Mode and Robinson, 1959).
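As an illustration of heritability on a progeny mean basis, the sketch below uses the widely used entry-mean formula h² = σ²g / (σ²g + σ²ge/E + σ²e/(E·R)), with E environments and R replications. Whether the study used exactly this form (for instance, with effective error variances from the lattice analysis) is an assumption, and the function name and the numeric inputs are hypothetical.

```python
def heritability_entry_mean(var_g, var_ge, var_e, n_env, n_rep):
    """Heritability on an entry-mean basis (assumed standard formula).

    var_g  : genotypic variance
    var_ge : genotype x environment interaction variance
    var_e  : error variance on a plot basis
    n_env  : number of environments
    n_rep  : number of replications per environment
    """
    # Variance of an entry mean across environments and replications
    var_phenotypic_mean = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / var_phenotypic_mean


# Hypothetical variance components, five environments, two replications
h2 = heritability_entry_mean(var_g=1.0, var_ge=0.5, var_e=2.0, n_env=5, n_rep=2)
```

The formula makes explicit why means across four or five environments give fairly high heritabilities even when the interaction and error variances are not small: both are divided by the number of environments.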
Quantitative Trait Loci Analyses
Quantitative trait loci mapping and estimation of their effects were performed with PLABQTL (Utz and Melchinger, 1996) employing CIM by the regression approach (Haley and Knott, 1992). All QTL analyses were performed with the joint map. An additive genetic model was assumed for the analysis of TC progenies as described in detail by Utz et al. (2000). Cofactors were selected by stepwise regression according to Miller (1990, p. 49) with an "F-to-enter" and "F-to-delete" value of 3.5. Testing for presence of a putative QTL in an interval by a likelihood ratio (LR) test was performed with a LOD threshold of 2.5 (LOD = 0.217 LR) in conformity with the foregoing publications on these materials. We also set higher LOD thresholds of 3.5 in A x BII and A x BIII as well as 5.0 in A x BI for certain comparisons across samples. Estimates of QTL positions were obtained at the point where the LOD score assumed its maximum in the region under consideration. For each population, the proportion of the phenotypic variance (σ²p) explained by a single QTL was determined as the square of the partial correlation coefficient (R²). Estimates of the allele substitution effect (α) of each putative QTL and their partial R² were obtained by fitting a model including all significant QTL for the respective trait simultaneously. This model was also used to estimate pDS, the proportion of the genotypic variance (σ²g) explained by all QTL detected with the whole data set (DS) for a given trait, by dividing the adjusted total R² (R²adj) by the heritability (h²) as described by Utz et al. (2000).
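The two conversions used in this paragraph can be written out in a few lines. The factor 0.217 follows from LOD = LR / (2 ln 10), and the proportion of genotypic variance is the adjusted R² divided by the heritability. The function names and the numeric inputs in the example are hypothetical and are not values from this study.

```python
import math

# LOD = LR / (2 * ln 10), i.e. LOD is approximately 0.217 * LR
LOD_PER_LR = 1.0 / (2.0 * math.log(10.0))


def lod_from_lr(lr):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lr * LOD_PER_LR


def lr_from_lod(lod):
    """Convert a LOD score to the equivalent likelihood ratio statistic."""
    return lod / LOD_PER_LR


def proportion_genotypic_variance(r2_adj, h2):
    """Proportion of genotypic variance explained by all QTL: R2adj / h2."""
    return r2_adj / h2


lr_threshold = lr_from_lod(2.5)                     # about 11.5
p_example = proportion_genotypic_variance(0.30, 0.75)  # hypothetical inputs
```

Dividing by the heritability rescales the explained phenotypic variance to the genotypic variance, so the same R²adj corresponds to a larger proportion p when the trait is less heritable.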
Fivefold CV implemented in PLABQTL was used to obtain asymptotically unbiased estimates of pDS (Shao, 1997). For each population, a DS comprising the entry means across environments was divided into five genotypic subsamples. Four of these were combined in an estimation set (ES) for QTL detection and estimation of genetic effects, whereas the remaining subsample was used as a test set (TS) to validate the predictions gained from ES. We call this analysis standard CV. This analysis deviates from CV/G described by Utz et al. (2000), where the ES and TS were defined by omitting one environment of a DS. Here, data from all environments were averaged to obtain phenotypic values, and therefore only five different CV runs are possible for a given assignment of genotypes to subsamples by permuting the respective subsamples. A total of 1000 replicated CV runs was performed with 200 randomizations for assigning genotypes to the respective subsamples. Estimates of the proportion of the genotypic variance (σ²g) explained by all QTL detected for a given trait were calculated as medians p̂ES from the 1000 estimates in ES. The validated median p̂TS.ES was obtained by correlating the observed data in TS with those predicted on the basis of QTL positions and effects estimated in ES. An ad hoc estimate of the bias of pDS was calculated as the difference of medians p̂ES − p̂TS.ES. The bias of an individual QTL effect in a DS was estimated as the difference of means α̂ES − α̂TS.ES by averaging across all CV runs which contained the individual QTL of a DS within a ±10-cM interval of the QTL position estimated by CIM in a DS. Here, α̂ES is the mean estimate in ES, and α̂TS.ES the result of its validation in TS at the QTL position of ES. Within the same interval, the QTL frequency (i.e., the frequency of occurrence of a putative QTL) was determined across the 1000 CV runs.
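The resampling scheme described here (200 random assignments of genotypes to five subsamples, each subsample serving once as TS) can be sketched as below. This only illustrates how the 1000 ES/TS pairs arise and how the ad hoc bias is formed from medians. It is not the PLABQTL implementation, the QTL analysis itself is omitted, and the names `cv_runs` and `cv_bias` are hypothetical.

```python
import random
import statistics


def cv_runs(genotypes, k=5, n_randomizations=200, seed=0):
    """Yield (estimation_set, test_set) pairs for replicated k-fold CV.

    Each randomization shuffles the genotypes, splits them into k subsamples,
    and uses each subsample once as the test set (k runs per randomization).
    """
    rng = random.Random(seed)
    for _ in range(n_randomizations):
        shuffled = list(genotypes)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for t in range(k):
            test_set = folds[t]
            estimation_set = [g for j, fold in enumerate(folds) if j != t for g in fold]
            yield estimation_set, test_set


def cv_bias(p_es_values, p_ts_values):
    """Ad hoc bias estimate: median of ES estimates minus median of TS estimates."""
    return statistics.median(p_es_values) - statistics.median(p_ts_values)


# 71 genotypes as in A x BIII: 200 randomizations x 5 folds = 1000 runs
runs = list(cv_runs(range(71)))
```

In a full analysis, QTL would be detected and their effects estimated in each estimation set, the resulting predictions would be correlated with the observed values in the matching test set, and the two lists of estimates would then be passed to `cv_bias`.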
Three procedures were employed for quantifying the congruency of QTL across populations: (i) number of congruent QTL, whereby individual QTL were considered congruent across two populations if their estimated map positions were within a 20-cM distance, irrespective of the sign of the estimated α-effects in the two populations; (ii) correlation of LOD score values r (LODi, LODj) (i, j = A x BI, A x BII, A x BIII, A x C, and C x D; i ≠ j) from populations i and j across the genome (Keightley and Knott, 1999), with significance thresholds for r at the 5% level determined as the 2.5 and 97.5 percentiles of 2000 permutations; (iii) the genetic correlation between predicted and observed TC performance, rg (Mi, Yj) (i, j = A x BI, A x BII, A x BIII, A x C, and C x D; i ≠ j). For brevity, a particular rg (Mi, Yj) will be denoted as rg (A x BI, A x BII), for example. Here, Mi is the predicted value based on the QTL positions and effects estimated in the population i (estimation population) and Yj is the observed value in the population j (validation population). For details, see Utz et al. (2000). The parameter rg (Mi, Yj) was estimated for all pairs of populations, except those having no parent in common. The assumption was that in crosses with one parent in common the other parent contributes the same allelic effects at the QTL in both crosses. If i and j represent populations of the same cross, rg (Mi, Yj) will be comparable with √p̂TS.ES derived from CV within the population i.
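The first two criteria lend themselves to a short sketch. The code below counts QTL of one population that lie within 20 cM of a QTL of another population on the same chromosome, and computes the plain Pearson correlation between two LOD profiles evaluated at the same map positions. How ties and many-to-one matches were handled in the study is not stated, so the counting rule here is an assumption, the permutation test for the correlation is omitted, and all names and example positions are hypothetical.

```python
import math


def count_congruent(qtl_i, qtl_j, max_distance_cm=20.0):
    """Count QTL in qtl_j within max_distance_cm of any QTL in qtl_i.

    Each QTL is a (chromosome, position_cM) tuple on a common (joint) map.
    The sign of the QTL effect is ignored, as in criterion (i).
    """
    count = 0
    for chrom_j, pos_j in qtl_j:
        if any(chrom_i == chrom_j and abs(pos_i - pos_j) <= max_distance_cm
               for chrom_i, pos_i in qtl_i):
            count += 1
    return count


def pearson(x, y):
    """Pearson correlation of two equally long sequences (e.g., LOD profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical QTL positions in two populations
qtl_large = [(1, 50.0), (8, 120.0)]
qtl_small = [(1, 62.0), (8, 150.0), (3, 50.0)]
n_common = count_congruent(qtl_large, qtl_small)
```

Because both functions compare positions directly, they presuppose that all populations were analyzed on the same joint map.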
| RESULTS |
|---|
The joint map spanned a total of 1138 cM with an average interval length of 14.4 cM in A x BI and A x BII, 15.0 cM in A x BIII, 12.1 cM in A x C, and 10.2 cM in C x D. This map covered approximately 70% of the genome covered by the reference map (Schön et al., 1994) and 84% of the QTL regions detected by Melchinger et al. (1998) in A x BI across traits.
In total, six marker loci in populations A x BI and A x BII, and three in A x C were scored as dominant markers. For markers of the joint map, the observed genotype frequencies generally coincided with the expected Mendelian segregation ratios in A x BII. Significant deviations were observed once in A x BI and A x BIII, twice in A x C, and in five cases in C x D. Significant (P < 0.01) deviations from 0.5 allele frequency were not found. The joint map is available at http://www.agron.missouri.edu (verified 20 Aug. 2003).
Agronomic Trait Analysis
Herein, only the results for populations A x BIII, A x C, and C x D will be presented because agronomic data of populations A x BI and A x BII were reported previously (Schön et al., 1994; Melchinger et al., 1998). Weather conditions were mostly favorable for grain maize production in all five environments, except for noticeable drought stress at Chartres reflected in reduced plant height and kernel weight estimates. The TC progeny means of population A x BIII (F̄5) exceeded TC progeny means of A x C and C x D (F̄4) for kernel weight and protein concentration (Table 1). For grain yield and plant height, the highest TC progeny means were obtained in C x D, whereas for grain moisture, the TC mean of A x C was highest (Table 1). The TC means of P1 and P2 differed significantly (P < 0.01) for all traits except grain yield in C x D and grain moisture in A x BIII. The orthogonal contrast between average TC performance of the parent lines (P̄) and the TC mean of the Fn lines (F̄n) was significant (P < 0.01) only for protein concentration in population A x C. For all traits and populations, the range in TC performance of Fn lines considerably exceeded the TC means of the parents.
Estimates of genotypic variance (σ²g) among TCs of Fn lines were highly significant (P < 0.01) for all traits in all populations (Table 1). Genotypic variances among F5 lines (A x BIII) were significantly higher (P < 0.01) than those among F3 lines in A x BI and A x BII. Estimates of genotype x environment interaction variance (σ²ge) were significantly greater than zero (P < 0.05) for all traits in all populations. Except for grain yield, σ²ge was consistently smaller than σ²g. Heritability was medium for grain yield (0.61 < ĥ² < 0.70), but relatively high for the other traits (0.84 < ĥ² < 0.93) in all three populations. Phenotypic correlations (r̂p) between related TC progenies from F3 lines (A x BII) and F5 lines (A x BIII) were highly significant (P < 0.01) for all traits. Corresponding genotypic correlations (r̂g) ranged from 0.32 to 0.62.
Quantitative Trait Loci Analyses
The QTL results for A x BI and A x BII were reported previously (Schön et al., 1994; Melchinger et al., 1998). Results from QTL analyses of all five populations based on the joint map are presented here for means across environments: the proportion of the genotypic variance explained in Table 2 and the number of QTL detected in Table 3. Detailed information on positions and effects of individual QTL detected can be obtained at http://www.agron.missouri.edu.
Grain Yield
[...] σ²p, and between p̂DS = 25.7 (A x BII) and 83.2% (A x C) of σ²g (Table 2). Across populations, the sum of absolute α-effects ranged from 0.92 (A x BI) to 4.07 Mg ha⁻¹ (A x BIII), corresponding to 8.9 and 45.6% of the TC means of F3 and F5 lines, respectively. Cross validation resulted in p̂TS.ES values ranging from 6.0 (A x BII) to 51.8% (A x C), which were substantially smaller than p̂ES values (Table 2).
Grain Moisture
We detected nine, four, three, seven, and six QTL for grain moisture in A x BI, A x BII, A x BIII, A x C, and C x D, respectively, distributed across the genome (Table 3). Collectively, they accounted for R²adj = 23.2% of σ²p in A x BIII and 37.6% in A x BI, the minimum and maximum obtained for the five populations. The proportion of σ²g explained by all putative QTL ranged from p̂DS = 26.2 (A x BIII) to 46.0% (A x BI) (Table 2). The sum of absolute α-effects was between 22.3 g kg⁻¹ in A x BIII (7.9% of F̄5) and 51.6 g kg⁻¹ in C x D (18.6% of F̄4). With CV, p̂TS.ES values ranged from 2.5 (C x D) to 33.0% (A x BI), which were considerably lower than the corresponding p̂ES values (Table 2).
Kernel Weight
Ten QTL regions across the genome were significantly associated with kernel weight in population A x BI, two in A x BII, three in A x BIII, and four each in A x C and C x D (Table 3). A simultaneous fit yielded a minimum R²adj = 8.3% in A x BII and a maximum R²adj = 43.9% in A x BI. Simultaneously, all putative QTL explained between 10.5 (A x BII) and 51.9% (A x BI) of σ²g (Table 2). The sum of absolute α-effects varied between 15.3 g in A x BII and 63.8 g in A x BI (4.7 and 20.5% of the TC mean of F3 lines, respectively). Estimates of p̂TS.ES ranged from 13.5 (A x C and C x D) to 42.3% (A x BI), and were substantially lower than corresponding estimates of p̂ES (Table 2).
Protein Concentration
Nine QTL were identified for protein concentration in A x BI, four in C x D, and six QTL in each of the populations A x BII, A x BIII, and A x C distributed across the genome (Table 3). Collectively, they explained between R²adj = 34.7% in A x C and 51.4% in A x BIII. Estimates of p̂DS ranged from 39.6 (A x C) to 56.0% (C x D) (Table 2). The sum of absolute α-effects varied from 11.3 g kg⁻¹ in C x D (10.3% of the TC mean of F4 lines) to 18.0 g kg⁻¹ in A x BIII (15.5% of the TC mean of F5 lines). Cross validation yielded estimates of p̂TS.ES between 9.8% in A x BIII and 38.9% in A x BI, being substantially reduced as compared with corresponding p̂ES values (Table 2).
Plant Height
A total of 12, 3, 1, 5, and 3 QTL affecting plant height was detected in A x BI, A x BII, A x BIII, A x C, and C x D, respectively (Table 3). A simultaneous fit explained between R²adj = 10.0 (A x BIII) and 52.6% (A x BI) of σ²p, and between 11.2 (A x BIII) and 66.5% (A x BI) of σ²g (Table 2). The largest sum of absolute α-effects was 48.4 cm in A x BI (19.3% of the TC mean of F3 lines), the smallest amounted to 6.8 cm in A x BIII (2.96% of the TC mean of F5 lines). Cross validation yielded estimates of p̂TS.ES ranging from 0.3 (A x BIII) to 49.3% of σ²g (A x BI), which were considerably smaller than their corresponding p̂ES estimates (Table 2).
Comparison of QTL across Populations
Comparing different samples of the same generation in the same cross, seven out of 18 QTL detected in the smaller population (A x BII) were found to be within a 20-cM distance from the 42 QTL detected in the larger population (A x BI) across all five traits (Table 3). For grain yield, no common QTL was detected. The genome-wide correlation of LOD-score values for A x BI and A x BII was significant (P < 0.05) only for kernel weight and plant height (Table 4). The genetic correlation rg (A x BI, A x BII) ranged from 0.26 for grain yield to 0.63 for kernel weight (Table 3).
In the comparison of populations having one parent in common, out of the 28 QTL detected in A x C across all five traits, only 10, 8, and 7 were common to the QTL detected in A x BI, A x BII, and A x BIII, respectively (Table 3). The genome-wide correlation of LOD scores between A x C and A x BI was significant (P < 0.05) only for kernel weight (Table 4). This was also the case when A x BIII was compared with A x C; however, when comparing A x BII vs. A x C, no significant correlations were obtained (data not shown). For most traits, rg (A x BIII, A x C) was higher than rg (A x BI, A x C) or rg (A x BII, A x C). The first correlation refers to populations evaluated in the same environments, which is not the case for the other two correlations. Estimates of rg (A x BI, A x C) were of medium size (0.46) for kernel weight but considerably lower for other traits. Only four out of 28 QTL identified in A x C were in common with the 23 QTL detected in C x D across traits (Table 3). The genome-wide correlations of LOD scores between A x C and C x D were close to zero for all traits (Table 4). The correlations rg (A x C, C x D) ranged from 0.09 (plant height) to 0.66 (grain yield) despite the fact that for grain yield only one QTL was common to both populations.
In the comparison of populations having no parent in common, out of the 23 QTL detected across all five traits in C x D, only two to four were in common with A x BI, A x BII, and A x BIII (Table 3). The genome-wide correlation of LOD scores was practically zero for all traits when comparing A x BI vs. C x D (Table 4). This was also the case when comparing A x BII or A x BIII vs. C x D (data not shown).
| DISCUSSION |
|---|
The second criterion, the correlation coefficient between LOD score profiles, overcomes this deficiency. As Keightley and Knott (1999) concluded from simulations and experimental results, however, the correlation coefficients were low and the power to detect congruency already decreased when several QTL underlay the trait. This was corroborated in our study because significant associations were obtained only if one or a few large QTL were congruent. Small differences in QTL positions often reduced the correlation substantially. Therefore, we agree with Keightley and Knott (1999) on not using this criterion for complex polygenic traits.
Our third criterion, the genetic correlation between predicted and observed phenotypic values, rg (Mi, Yj), estimates the QTL congruency quantitatively by taking into account both positions and effects of QTL. It deals adequately with cases of linked QTL (e.g., two linked QTL in a large sample or a ghost QTL in a smaller sample) and is best suited for assessing the prospects of MAS because it corresponds to the square root of the proportion of genetic variance explained by QTL. A shortcoming is the large estimation error associated with rg (Mi, Yj) if the heritability is low, because the latter occurs in the denominator of the formula. Furthermore, the same allelic effects at the QTL must be assumed if populations share one or no parent.
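The role of heritability in the denominator can be illustrated with a short sketch. It assumes the commonly used form in which the correlation between marker-based predictions and observed phenotypes is divided by the square root of the heritability in the validation population. The exact estimator is given by Utz et al. (2000), so the function below and its inputs are hypothetical illustrations only.

```python
import math


def genetic_correlation(r_pred_obs, h2_validation):
    """Assumed form: r_g(M, Y) = r(M, Y) / sqrt(h2) in the validation population.

    r_pred_obs    : correlation between predicted and observed phenotypic values
    h2_validation : heritability of the observed entry means
    """
    return r_pred_obs / math.sqrt(h2_validation)


# The same observed correlation is scaled up more strongly when h2 is low,
# so any sampling error in r or in h2 is amplified as well.
rg_high_h2 = genetic_correlation(0.40, 0.64)  # division by 0.8
rg_low_h2 = genetic_correlation(0.40, 0.25)   # division by 0.5
```

Squaring the result gives a quantity on the same scale as the proportion of genotypic variance explained by the QTL.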
Impact of Shortcomings in QTL Analyses on QTL Congruency across Samples
Lack of QTL congruency across different samples of the same cross reflects the limitations and shortcomings of QTL analyses. They depend on (i) random errors associated with phenotypic and marker data, (ii) sampling of genotypes and environments, and (iii) bias caused by model selection in QTL analyses.
The first factor was presumably of minor importance for explaining the poor QTL congruency between the three populations of A x B, because our phenotypic values referred to means across four or five environments and heritabilities were fairly high for all traits except grain yield (Table 1).
Genotypic sampling influences QTL detection and estimation of their positions and effects to a much higher extent than environmental sampling with more than three environments (Utz et al., 2000). This was corroborated herein also for grain yield, the trait with the highest expected G x E interaction variance. Estimated QTL x E interaction variance components in the PLABQTL analysis were mostly small compared with the QTL variance components across populations, except for A x C, where the two variance components were of similar size. The genetic variance explained by all putative QTL detected in A x C remained high with p̂TS.ES = 51.8% after standard CV (Table 2). With CV on independent environmental and genotypic samples (i.e., CV/GE in Utz et al. [2000]), however, the above estimate was reduced to p̂TS.ES = 22.1%. The reason may be the fact that two QTL detected in A x C showed different signs across the five test environments. In such a case, the environmental sample may influence the size of the QTL effect in the mapping population and consequently reduce the QTL congruency with the other populations.
Model selection in QTL mapping can introduce a bias and cause a substantial inflation in QTL estimates (Utz and Melchinger, 1994; Georges et al., 1995; Beavis, 1998; Broman, 2001; Göring et al., 2001). As demonstrated by simulations of these authors, the bias in estimates of individual QTL effects as well as p can be as high as the true parameters, with the bias and sampling error increasing for small sample sizes and small effects of the QTL.
By the same token, the power of QTL detection increases for larger sample sizes and effects of QTL. Assuming a QTL with an estimated R2 = 0.10, which corresponds to the average value across all traits and QTL determined in our study, the power of detecting such a QTL is 0.98 for N = 500 but only 0.65 for N = 100 (Charcosset and Gallais, 1996). The probability of detecting such a QTL simultaneously in two independent samples is obtained by multiplication. Taking bias into account, the true QTL effect is only about half as large as the estimated QTL effect, which reduces the probability of joint QTL detection in both samples to 0.30. This value is in close agreement with the proportions of congruent QTL detected in A x BI vs. A x BII or A x BIII. The QTL congruency is further reduced if a constant Type I error level is chosen because our 2.5 LOD threshold corresponds to a level of 0.14 in A x BI, 0.23 in A x BII, and 0.40 in A x BIII with use of the permutation test of Doerge and Churchill (1996).
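The multiplication step in this argument is simple enough to write out. Assuming detection in the two samples is independent, the joint probability is the product of the two powers quoted from Charcosset and Gallais (1996). The function name is hypothetical, and the further reduction to 0.30 after accounting for bias depends on power values that are not given in the text, so it is not reproduced here.

```python
def joint_detection_probability(power_a, power_b):
    """Probability of detecting the same QTL in two independent samples."""
    return power_a * power_b


# Powers quoted in the text for a QTL with R2 = 0.10
power_n500 = 0.98
power_n100 = 0.65
p_joint = joint_detection_probability(power_n500, power_n100)  # about 0.64
```

Even before any correction for overestimated effects, roughly one in three such QTL would therefore be expected to go undetected in at least one of the two samples.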
In conclusion, genotypic sampling and estimation bias can largely explain the low rate of congruency between QTL detected in different samples of the same cross. Consequently, with a low power of QTL detection it remains an open question whether incongruency was due to sampling error or due to genetic causes, as there may be different QTL x environment interactions when populations are grown in different environments or different allelic effects at QTL in the case of different crosses.
Information Gain from Cross Validation
Resampling methods such as CV have been proposed to determine the sampling error and bias of QTL estimates (Utz et al., 2000). By a comparison of CV results from populations A x BI, A x BII, and A x BIII, we examined whether CV permits assessment of (i) the power of QTL detection by looking at QTL frequencies, (ii) the bias and standard error of individual QTL effects, and (iii) the bias in p calculated as the difference in corresponding estimates from ES and TS. For a summary across traits, QTL effects were standardized by dividing the estimated substitution effects by the phenotypic standard deviation of entry means.
The fidelity of QTL detection was assessed by QTL frequency, which corresponds to the percentage of the 1000 CV runs, in which the QTL was detected in the ±10-cM interval of the QTL position found by CIM in a DS. As expected, the QTL frequency decreased with decreasing sample size and averaged 0.74 in A x BI, 0.54 in A x BII, and 0.46 in A x BIII. Even with N = 344 in A x BI, the QTL frequency exceeded 0.95 only for seven out of the 42 detected QTL. In the smaller samples, the maximum QTL frequency amounted to 0.88. In all three populations, the QTL frequency was significantly correlated with the LOD scores and the absolute standardized QTL effects, which corroborates that it is a good indicator of the power of QTL detection.
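The QTL frequency defined above can be computed from the lists of QTL detected in the individual CV runs. The sketch below is an illustration under the assumption that each run reports its QTL as (chromosome, position) pairs on the joint map. The function name and the example data are hypothetical.

```python
def qtl_frequency(reference_qtl, cv_detections, window_cm=10.0):
    """Fraction of CV runs that detected a QTL within +/- window_cm of a reference QTL.

    reference_qtl : (chromosome, position_cM) of the QTL found in the whole data set
    cv_detections : list of runs, each a list of (chromosome, position_cM) tuples
    """
    chrom, pos = reference_qtl
    hits = sum(
        1
        for run in cv_detections
        if any(c == chrom and abs(p - pos) <= window_cm for c, p in run)
    )
    return hits / len(cv_detections)


# Four hypothetical CV runs; the reference QTL is recovered in two of them
example_runs = [[(1, 45.0)], [(1, 70.0)], [], [(2, 50.0), (1, 58.0)]]
freq = qtl_frequency((1, 50.0), example_runs)
```

A run counts only once regardless of how many QTL fall inside the window, which keeps the frequency between 0 and 1.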
The average of the standardized QTL effects across all five traits amounted to 0.34 in A x BI, 0.47 in A x BII, and 0.38 in A x BIII. These differences are largely attributable to the increased bias of QTL effects estimated from smaller populations because the CV bias of standardized QTL effects averaged 0.06 in A x BI, but 0.18 in A x BII and A x BIII. Large estimated QTL effects generally displayed a smaller bias than the smaller ones. The CV also revealed a large variation in QTL effects estimated from TS in different runs. The variation of estimated bias was also smaller in the group of larger QTL than in the group of smaller QTL, especially in the large population A x BI. Hence, for smaller populations our results corroborate the findings of Göring et al. (2001) that the estimated QTL effects may be virtually independent of the true size of the QTL. Moreover, IV corresponds essentially to a single CV run and shows high standard errors of QTL effects when using small sample sizes unless a QTL is very large.
While individual QTL effects often deviated considerably between CV and IV, estimates of p (p̂TS.ES) averaged across traits from CV and r²g (Mi, Yj) from IV showed good agreement if the large population A x BI was used for QTL mapping (Table 5). This confirms that CV provides asymptotically unbiased estimates of p (Shao, 1997). The LOD thresholds for these comparisons were set higher than 2.5 as we found the congruency to be mostly due to the largest QTL.
Trait-Specific QTL Congruency
Falconer and Mackay (1996)(p. 357) designated QTL explaining >10% of the phenotypic variance or their standardized effects exceeding 0.5, respectively, as "large." The standardized effects averaged across the three populations of the cross A x B were <0.5 as already discussed. However, at least one large QTL was found in each population and for each trait. Although these large QTL were not necessarily detected at congruent positions across populations, for kernel weight, protein concentration, and plant height they could have been detected even with higher LOD thresholds (3.5 for A x BII and A x BIII, 5.0 in A x BI) and contributed substantially to the high genome-wide congruency evidenced by genetic correlations rg (Mi, Yj) in Table 3. Large QTL did not act accordingly for grain yield and grain moisture, which may be due to high estimation error or a higher number of small QTL underlying these traits. Moreover, presence of highly integrated epistatic complexes (Stuber et al., 1999) or varied control of these traits via metabolic pathways (Bost et al., 1999) may be other causes for this result.
With sample sizes typically used in QTL mapping experiments, it seems unrealistic to unravel the genetic architecture of polygenic traits. Even with N = 344 in A x BI, one can make only cautious inferences concerning the importance and width of a QTL region. Limitations are already manifest in detecting the true number of QTL (Otto and Jones, 2000) and furthermore in estimating the degree of dominance and epistasis of a given trait.
Congruency of QTL from Different Crosses
Owing to the high selection pressure exerted in maize breeding programs, it seems plausible that the same favorable alleles are fixed at a QTL in both parents of a cross within the same heterotic group. Thus, polymorphism at a QTL in one but its absence in the other cross could be a biological cause for incongruency. Furthermore, the divergence of the parental lines of two crosses will be reflected in magnitude and direction of effects found for QTL at congruent positions. Moreover, epistasis can modulate the effect of a QTL depending on the genetic background. Hence, it is not surprising that we found no QTL congruent among all crosses.
Congruency as evidenced by the genetic correlations rg (Mi, Yj) was generally diminished if one of the parents varied between crosses. A noticeably higher value of rg (A x C, C x D) was found for grain yield due to a large congruent QTL on chromosome 1. The higher rg values of A x B populations with A x C for kernel weight were also mostly attributable to a large congruent QTL on chromosome 8. It is striking that in other QTL studies in maize, QTL for grain yield and its components were reported on the same region of chromosome 1 and on chromosome 8 (Abler et al., 1991; Beavis et al., 1994; Austin and Lee, 1996; Veldboom and Lee, 1996). Each of these QTL may represent either a gene complex or individual genes controlling a specific metabolic pathway or gene network.
Alternative approaches to QTL mapping that do not rely on biparental crosses might provide new tools for investigating the congruency of QTL in different populations. Besides QTL mapping in multiple-line crosses (Rebai and Goffinet, 2000; Xie et al., 1998; Xu, 1998; Liu and Zeng, 2000), the haplotype-based QTL mapping approach recently devised by Jansen et al. (2003) promises progress in this direction, because it can be applied to progeny from multiple related crosses. Furthermore, congruent QTL across different genetic backgrounds can be confirmed by association mapping (Meuwissen and Goddard, 2000; Thornsberry et al., 2001), if candidate genes and/or high density maps are available.
Implications for Marker-Assisted Selection and QTL Mapping
The high estimation error and low power explain why in most published experiments on MAS, only about half of the QTL under selection actually contributed to the realized selection response (Eathington et al., 1997; Mather et al., 1997; Igartua et al., 2000; Bouchez et al., 2002). Obviously, the chances for MAS are substantial if at least a few large QTL are detected, even if some of them are false positives or overestimated.
Marker-assisted selection should be promising in our material for some traits such as kernel weight, protein concentration, and plant height because independent samples of the same cross yielded congruent QTL and explained up to 46% of the genetic variance. For these traits, genetic correlations between A x BII and A x BIII, for example, based on the whole genotype (Table 1) corresponded well to the rg (A x BII, A x BIII) based on the QTL genotype (Table 3). Nevertheless, even for these traits we recommend mapping in a large population of at least 300 individuals, comparable to the one used in this study for A x BI (N = 380). The p values estimated from validation were still below the corresponding h² estimates; consequently, MAS will be superior to phenotypic selection only if it is more cost-effective (Lande and Thompson, 1990; Knapp, 1998).
In view of the high costs of QTL mapping experiments, it would be advantageous if QTL regions were consistent among crosses and only the most suitable flanking marker and the sign of the QTL allele would have to be determined for each population. Remapping of QTL at regular intervals during MAS is necessary because QTL-marker associations change during several generations of selection (Gimmelfarb and Lande, 1995). A multistage approach with estimation of QTL in one generation and with validation and combined estimation in the next generation would allow for an efficient use of both phenotypic and marker data. An essential prerequisite for this approach is the integration of QTL mapping in ordinary breeding programs with elite germplasm, as suggested by Jannink et al. (2001).
| ACKNOWLEDGMENTS |
|---|
Received for publication March 24, 2003.
| REFERENCES |
|---|