Published online 1 February 2006
Published in Crop Sci 46:614-621 (2006)
© 2006 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

CROP BREEDING, GENETICS & CYTOLOGY

Usefulness of Gene Information in Marker-Assisted Recurrent Selection: A Simulation Appraisal

Rex Bernardo(a),* and Alain Charcosset(b)

a Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, 411 Borlaug Hall, 1991 Upper Buford Cir., St. Paul, MN 55108
b Institut National de la Recherche Agronomique, Station de génétique végétale, Ferme du Moulon, 91190 Gif-sur-Yvette, France

* Corresponding author (bernardo{at}umn.edu)


    ABSTRACT
Genomics and post-genomics sciences are expected to uncover most, if not all, of the quantitative trait loci (QTL) in plants. Prior knowledge of QTL locations can then be exploited in marker-assisted recurrent selection (MARS). Our objectives were to determine (i) whether prior knowledge of QTL locations is advantageous in MARS, and (ii) whether knowledge of the QTL themselves, as opposed to knowledge of markers linked to QTL, is advantageous in MARS. We simulated MARS in a maize (Zea mays L.) F2 population. We found that when 10 QTL controlled the trait, the percentage of known QTL that maximized the response to MARS was PMax = 100%. In contrast, PMax was often less than 100% when 40 or 100 QTL controlled the trait and QTL effects were estimated with a population size (N = 100) typically used in MARS. This result implied it was advantageous to exploit only the QTL with large effects and ignore those with small effects, even if the locations of all QTL were known. For a trait controlled by 40 QTL, the response was up to 50% greater when PMax = 70% of the QTL were known through markers for the QTL themselves rather than through linked markers. We conclude that having known QTL in MARS is most beneficial for traits controlled by a moderately large number of QTL (e.g., 40). We speculate that a combination of approaches would be needed to exploit information on markers for QTL themselves, markers linked to QTL, and unknown QTL.

Abbreviations: cM, centimorgan • MARS, marker-assisted recurrent selection • QTL, quantitative trait loci


    INTRODUCTION
Maize breeding comprises three stages: (i) population improvement, (ii) inbred development, and (iii) hybrid development. Maize seed companies have successfully exploited marker-QTL associations in population improvement (Edwards and Johnson, 1994; Johnson, 2001, 2004; Koebner, 2003). Specifically, marker-assisted recurrent selection or MARS refers to the improvement of an F2 population by one cycle of marker-assisted selection (i.e., based on phenotypic data and marker scores) followed by three cycles of marker-based selection (i.e., based on marker scores only). With the use of year-round nurseries, MARS adds only 1 yr to the time required to develop maize inbreds (Koebner, 2003). Johnson (2004) reported that, averaged across six proprietary F2 maize populations, grain yield increased from 10.0 Mg ha⁻¹ in cycle 0 to 10.5 Mg ha⁻¹ in cycle 1 and to 10.8 Mg ha⁻¹ in cycle 2. Such increases in the population mean lead to increases in the performance of inbreds derived from the population.

The MARS procedures used to date have relied on ad hoc significance tests (i) to identify markers associated with the trait and, subsequently, (ii) to estimate the effect associated with each marker. These ad hoc significance tests are done separately for each population undergoing MARS. Advances in genomics and post-genomics sciences, however, are expected to uncover most, if not all, of the genes for important traits in crops. Such gene discovery is expected through different approaches including candidate gene analysis, sequence homology comparisons, gene expression profiles, proteomics, metabolomics, and phenomics (Bowen and Luedtke, 1997; Somerville and Somerville, 1999; Thiellement et al., 2002; Fiehn, 2002; Gerlai, 2002). If the genes underlying a quantitative trait become known, then their locations would no longer need to be determined by ad hoc significance tests in MARS. Only the quantitative effects associated with the actual QTL would need to be estimated. In this manuscript we define known QTL as QTL whose genomic locations are known a priori but whose effects in terms of the quantitative trait need to be estimated in the breeding population. The known QTL may be identified through markers for the QTL themselves, or through functionally neutral markers linked to the actual QTL.

Results from a previous study (Bernardo, 2001) suggest that having known QTL would enhance MARS. Bernardo (2001) investigated the usefulness of known QTL in inbred and hybrid development. He found that best linear unbiased prediction based on pedigree information was highly effective for predicting hybrid performance. Due to this high effectiveness, having known QTL did not substantially enhance hybrid development. In contrast, best linear unbiased prediction based on pedigree information is ineffective in selection within an F2 or backcross population because all individuals within the population have the same degree of relatedness based on pedigree (Bernardo, 2002, p. 234). This ineffectiveness of pedigree information in selection led to greater room for improvement, through markers linked to the QTL or markers for the QTL themselves, in inbred development than in hybrid development. Bernardo (2001) did not investigate the usefulness of known QTL in MARS. Because populations undergoing MARS lack pedigree relationships that can be exploited in best linear unbiased prediction, we surmise that having known QTL would enhance MARS in the same manner that it enhances inbred development.

In a preliminary study, Charcosset and Moreau (2004) found that having known QTL increased the expected efficiency of marker-assisted selection. This preliminary study involved a simple genetic model of 10 unlinked QTL with equal effects. In the current study, we examined the usefulness of having known QTL under genetic models that included different numbers of QTL, different levels of heritability, unequal gene effects, linkage, and epistasis. Our first objective was to determine whether knowing some or all of the QTL, as opposed to detecting unknown QTL through ad hoc significance tests, is advantageous in MARS. Our second objective was to determine whether knowing the QTL themselves, as opposed to knowing QTL through functionally neutral linked markers, is advantageous in MARS.


    MATERIALS AND METHODS
Prior Knowledge of Gene Locations
We considered two genetic models that differed in the correspondence between markers and QTL. In the Flanking Marker model, we assumed that a QTL was known through a linked marker to its left and a linked marker to its right. In the QTL Per Se model, we assumed that a QTL itself was represented by a marker. The Flanking Marker model represented the level of information expected from QTL mapping studies or QTL meta-analysis (Goffinet and Gerber, 2000). The QTL Per Se model represented the level of information expected from candidate gene analysis or genomics and post-genomics approaches for identifying the actual genes that control quantitative traits.

We considered a trait controlled by l = 10, 40, or 100 QTL in a linear metabolic pathway, which is described in the next section. The first QTL had the largest effect, the second QTL had the second largest effect, the third QTL had the third largest effect, and so on. For both the Flanking Marker and QTL Per Se models, the percentage of known QTL was P = 0%, 10%, 20%, 30%, ... , 100%. We assumed that the QTL with the largest effects were known first and that the QTL with the smallest effects were known last. For example, for l = 100 QTL, the first 10 QTL (i.e., the 10 QTL with the largest effects) were known for P = 10%, and the first 20 QTL were known for P = 20%. This assumption was for simplicity and was based on the argument that the QTL with large effects are likely to be discovered first. The percentage of the genetic variance (VG) accounted for by the known QTL was 20% (l = 10) to 24% (l = 100) for P = 10%; 39% (l = 10) to 41% (l = 100) for P = 20%; and 53% (l = 10) to 55% (l = 100) for P = 30%. For P > 30%, values at a given P were similar regardless of the number of QTL: 66% for P = 40%; 75% for P = 50%; 82% for P = 60%; 88% for P = 70%; 93% for P = 80%; 97% for P = 90%; and 100% for P = 100%.
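The rule that the largest-effect QTL become known first can be sketched as follows. This is only an illustration under the stated ordering (QTL 1 has the largest effect); the function name `known_qtl` is hypothetical and not part of the original simulation code.

```python
def known_qtl(n_qtl, percent_known):
    """Return the 1-based indices of the known QTL.

    Assumes QTL are numbered in decreasing order of effect, so the
    first n_qtl * percent_known / 100 QTL are the ones treated as known.
    """
    if not 0 <= percent_known <= 100:
        raise ValueError("percent_known must lie between 0 and 100")
    n_known = n_qtl * percent_known // 100
    return list(range(1, n_known + 1))


# For l = 100 QTL and P = 10%, the 10 largest-effect QTL are known.
print(known_qtl(100, 10))
```

With l = 40 and P = 80%, the same rule gives 32 known QTL.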

When none of the QTL were known (P = 0%), markers with significant effects were detected by ad hoc statistical tests at significance levels of α = 0.20, 0.30, or 0.40. These α levels were consistent with previous studies indicating MARS is most efficient at relaxed significance levels (Hospital et al., 1997; Moreau et al., 1998).

Mapping Population, Genotypic and Phenotypic Values, and QTL Detection
Details of the procedures we used to simulate MARS were described in a previous article (Bernardo, 2004). We conducted 500 repeats of each simulation experiment and averaged the results across the repeats. Each repeat differed in the genetic map, the genotypes of the individuals sampled, and the phenotypic values.

A simulated F1 generation, formed by crossing two parental inbreds, was selfed to form an F2 population of N = 100, 200, or 400 plants. The F2 population was segregating at 100 codominant marker loci. The sizes of the chromosomes (ranging from 128 to 241 cM) and of the entire genome (1749 cM) corresponded to those in a published maize linkage map (Senior et al., 1996). The genome was divided into 100 bins of 1749/100 ≈ 17 cM. A marker was assumed randomly located within ±5 cM of the midpoint of each bin. In the Flanking Marker model, the l QTL were randomly located among the 10 chromosomes. In the QTL Per Se model, the position of a QTL corresponded to the exact position of a random marker. The first parent had the favorable allele at even-numbered QTL and the less-favorable allele at odd-numbered QTL. This procedure led to random coupling and repulsion linkages between QTL.
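The marker layout described above can be sketched as follows. The sketch treats the genome as a single 1749-cM span rather than 10 separate chromosomes and draws the offset from a uniform distribution; both are simplifying assumptions, and the function names are hypothetical.

```python
import random

GENOME_CM = 1749.0   # total map length from the text
N_BINS = 100         # one marker per bin


def place_markers(rng, genome_cm=GENOME_CM, n_bins=N_BINS, jitter_cm=5.0):
    """Place one marker within +/- jitter_cm of each bin midpoint.

    Assumption: the offset is uniform on [-jitter_cm, +jitter_cm].
    """
    bin_width = genome_cm / n_bins
    positions = []
    for b in range(n_bins):
        midpoint = (b + 0.5) * bin_width
        positions.append(midpoint + rng.uniform(-jitter_cm, jitter_cm))
    return positions


def first_parent_has_favorable(qtl_number):
    """First parent carries the favorable allele at even-numbered QTL."""
    return qtl_number % 2 == 0


rng = random.Random(2006)          # arbitrary seed for repeatability
markers = place_markers(rng)
print(len(markers), round(GENOME_CM / N_BINS, 2))
```

Because each bin is about 17.5 cM wide and the offset never exceeds 5 cM, adjacent markers cannot swap order under this layout.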

In each simulated cycle of selection, F2 (in cycle 0) or S0 individuals (in cycles 1–4) were selfed and the resulting F3 or S1 families were crossed to an unrelated inbred tester. Testcross genotypic values were simulated according to metabolic control theory (Bost et al., 1999). For the ith QTL, the enzyme activity for the favorable allele was mi + bi whereas the enzyme activity for the less favorable allele was mi – bi. The midparent enzyme activity was mi = a^i/2, where the value of a for an effective number of loci equal to l was calculated using Eq. 11 of Lande and Thompson (1990). The value of bi, assuming a coefficient of variation of 0.15 relative to mi, was calculated as (0.15)mi√2 (Bost et al., 1999). The enzyme activity of the heterozygote was equal to mi. The metabolic flux, which depended on the enzyme activities according to metabolic control theory (Kacser and Burns, 1981), was considered as the testcross genotypic value (Bost et al., 1999):

$$G_k = \frac{c}{\sum_{i=1}^{l} 1/E_{ik}}$$
where Gk was the testcross genotypic value of the kth individual; Eik was the activity of the ith enzyme in the kth individual; and c was a constant which did not affect the relative values of Gk. To reduce rounding errors, c was assumed equal to the square of the number of QTL.
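A minimal sketch of the genotypic-value calculation follows. It assumes the usual linear-pathway form of metabolic control theory, in which the flux equals c divided by the sum of the reciprocal enzyme activities. The function names are hypothetical, and the midparent activities are passed in rather than derived from Lande and Thompson (1990).

```python
import math


def enzyme_activity(m_i, n_favorable):
    """Activity at one QTL given 0, 1, or 2 copies of the favorable allele.

    Homozygotes are m_i - b_i and m_i + b_i; the heterozygote is m_i,
    with b_i = 0.15 * m_i * sqrt(2) as stated in the text.
    """
    if n_favorable not in (0, 1, 2):
        raise ValueError("n_favorable must be 0, 1, or 2")
    b_i = 0.15 * m_i * math.sqrt(2)
    return m_i + (n_favorable - 1) * b_i


def flux(activities, c=None):
    """Assumed flux form: c / sum(1 / E_i), with c = l**2 by default."""
    n_qtl = len(activities)
    if c is None:
        c = n_qtl ** 2   # square of the number of QTL, as in the text
    return c / sum(1.0 / e for e in activities)


# Illustrative midparent activities (hypothetical values).
midparents = [0.5, 0.4, 0.3]
all_het = [enzyme_activity(m, 1) for m in midparents]
all_fav = [enzyme_activity(m, 2) for m in midparents]
print(flux(all_het) < flux(all_fav))
```

Because c only rescales every individual's value by the same factor, rankings among individuals are unchanged by its choice.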

Under metabolic control theory, genes show physiological epistasis (Kacser and Burns, 1981; Cheverud and Routman, 1995) but additive variance accounts for a high percentage of VG (about 98% in this study, and 95 to 99% in Bost et al., 1999). Justification for the use of metabolic control theory is given in the Discussion.

Random nongenetic effects were added to the genotypic values to obtain testcross phenotypic values. The random nongenetic effects had a normal distribution with a mean of zero and were scaled so that broad-sense heritability among testcrosses was H = 0.20, 0.50, or 0.80 in the initial F2 population. The amount of nongenetic variance, for each level of H, was constant across cycles of selection.
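The scaling of nongenetic effects can be sketched from the definition H = VG/(VG + VE), which gives VE = VG(1 – H)/H. The function names below are hypothetical; the nongenetic variance is computed once from the initial population and then reused, matching the statement that it was constant across cycles.

```python
import random
import statistics


def nongenetic_variance(genetic_variance, heritability):
    """Solve H = VG / (VG + VE) for VE."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    return genetic_variance * (1.0 - heritability) / heritability


def add_nongenetic_effects(genotypic_values, v_e, rng):
    """Add normal deviates with mean zero and variance v_e."""
    sd_e = v_e ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genotypic_values]


rng = random.Random(46)                                  # arbitrary seed
cycle0_genotypic = [rng.gauss(10.0, 1.0) for _ in range(100)]  # placeholder values
v_g = statistics.pvariance(cycle0_genotypic)
v_e = nongenetic_variance(v_g, heritability=0.50)       # fixed for later cycles
phenotypes = add_nongenetic_effects(cycle0_genotypic, v_e, rng)
print(len(phenotypes))
```

At H = 0.20 the nongenetic variance is four times the genetic variance, whereas at H = 0.80 it is one quarter of it.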

For both the Flanking Marker model and QTL Per Se model, the effects associated with markers were estimated only in the initial F2 population (i.e., cycle 0). When none of the QTL were known (P = 0%), markers associated with the trait at a given α level were identified using a two-step process (Bernardo, 2004). First, multiple regression of phenotypic value on the number of marker alleles (0, 1, or 2) from the first parental inbred was performed on a chromosome-by-chromosome basis. Significant markers on each chromosome were identified by backward elimination. Second, multiple regression coefficients were obtained by jointly analyzing all the markers found significant in the per-chromosome analysis. When QTL were known (P > 0%), the pairs of markers that flanked the known QTL were included in multiple regression (without testing for their significance) in the Flanking Marker model. Likewise, the markers that corresponded to the known QTL themselves were included in multiple regression (without testing for their significance) in the QTL Per Se model. Standard procedures were used to handle any singularities encountered in multiple regression analysis (Press et al., 1992, p. 56).

When none of the QTL were known (P = 0%), the average power to detect QTL, false discovery rate, and ratio between the variance explained by the significant markers (VM; Lande and Thompson, 1990; Hospital et al., 1997) and VG were calculated (Bernardo, 2004).
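As a rough illustration of the first two summary statistics, the sketch below uses the common textbook definitions of power and false discovery rate for the QTL Per Se case, where a significant marker either coincides with a QTL or does not. The exact definitions in Bernardo (2004), particularly for flanking markers, may differ; the function name is hypothetical.

```python
def power_and_fdr(significant_markers, qtl_markers):
    """Return (power, false discovery rate) for one simulation repeat.

    Assumed definitions:
      power = detected QTL / total QTL
      FDR   = significant markers that are not QTL / significant markers
    """
    significant = set(significant_markers)
    qtl = set(qtl_markers)
    hits = significant & qtl
    power = len(hits) / len(qtl)
    fdr = (len(significant) - len(hits)) / len(significant) if significant else 0.0
    return power, fdr


# Toy example: markers 1 and 2 are true detections, 3 and 4 are false.
print(power_and_fdr([1, 2, 3, 4], [1, 2, 5]))
```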

Simulation of MARS
In phenotypic selection, the 10% of families with the highest testcross performance were random-mated to form the next cycle of selection. The following procedures were used in MARS. In cycle 0, a marker score (Mi; Lande and Thompson, 1990) for the ith family was calculated from the multiple regression coefficients for the markers with significant effects (for P = 0%) or for the known QTL (for P = 10 to 100%). This marker score was then combined with the family's testcross phenotypic value in a least-squares selection index (Ii) as outlined by Lande and Thompson (1990), but with the restriction that the weight for the phenotypic value was always positive (Hospital et al., 1997). The 10% of cycle 0 families with the highest Ii values (i.e., marker-assisted selection) were random-mated to form cycle 1. The cycle 1 families were then ranked according to Mi only (i.e., marker-based selection) and the best 10% of families were random-mated to form cycle 2. This marker-based selection procedure was repeated in cycles 2 and 3 of MARS. Selection responses, expressed in units of the genetic standard deviation (σG) in cycle 0, were calculated for each cycle of phenotypic selection and MARS. Frequencies of the favorable allele at each QTL were calculated.
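The selection steps above can be sketched as follows. The index weights are taken as inputs because the Lande and Thompson (1990) weight formulas are not reproduced here; only the positivity restriction on the phenotypic weight is enforced. All function names are hypothetical.

```python
def marker_score(allele_counts, coefficients):
    """Sum of regression coefficient times marker allele count (0, 1, or 2)."""
    return sum(b * x for b, x in zip(coefficients, allele_counts))


def selection_index(phenotype, score, w_pheno, w_marker):
    """Linear index of phenotype and marker score (weights supplied by caller)."""
    if w_pheno <= 0:
        raise ValueError("weight for the phenotypic value must be positive")
    return w_pheno * phenotype + w_marker * score


def select_top(values, fraction=0.10):
    """Indices of the best `fraction` of families, highest value first."""
    n_keep = max(1, round(len(values) * fraction))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n_keep]


# Cycle 0 ranks families on the index; later cycles rank on marker score only.
scores = [marker_score(g, [0.5, 1.0, -0.2]) for g in ([2, 0, 1], [1, 1, 1], [0, 2, 2])]
print(select_top(scores, fraction=0.34))
```

With N = 100 families and a 10% selected fraction, `select_top` keeps 10 families per cycle.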


    RESULTS
Proportions of Known QTL that Maximize the Response to MARS
When 10 QTL controlled the trait, the percentage of known QTL that led to the maximum response to MARS was PMax = 100% (Table 1). Knowing all 10 QTL led to the largest response to MARS regardless of the size of the F2 population (N = 100, 200, or 400), the heritability of the trait (H = 0.20, 0.50, 0.80), and whether the QTL were known through linked markers (Flanking Marker model) or the QTL themselves were represented by markers (QTL Per Se model). The responses to MARS decreased as the percentage of known QTL decreased. Consider a highly heritable trait (H = 0.80) controlled by 10 QTL and a population size of N = 100 for estimating the effects of the known QTL. For the QTL Per Se model, the response to MARS was reduced by more than half when P = 20 to 40% of the QTL (2–4 out of 10 QTL) were known compared with when P = 100% of the QTL (all 10 QTL) were known (Fig. 1a). In this situation, favorable alleles at the two to four known QTL were fixed or nearly fixed (i.e., average allele frequencies of 0.92–1.00) in cycle 2 (results not shown). Consequently, little or no gain was achieved after cycle 2 when P = 20 to 40% of the 10 QTL were known (Fig. 1a). When all 10 QTL were known, average frequencies of the favorable alleles in cycle 4 ranged from 0.94 to 1.00 under the QTL Per Se model and from 0.81 to 0.95 under the Flanking Marker model (H = 0.80 and N = 100).


Table 1. Summary of responses to MARS under different levels of prior information on QTL location, assuming different numbers of QTL, population sizes (N), and trait heritabilities (H).

 

Fig. 1. Response (in number of genetic standard deviations) to phenotypic selection (PS) and MARS with different proportions of known QTL (P). Results for P = 10, 30, 50, 70, and 90% are not shown because of the closeness of the lines. Effects of the QTL were estimated with a population size of N = 100.

 
The values of PMax were often less than P = 100% when 40 or 100 QTL controlled the trait (Table 1). Consider the intermediate situation of a trait controlled by 40 QTL and with a heritability of H = 0.50. When effects of the QTL were estimated with a population size of N = 100, maximum responses to MARS were obtained when PMax = 60% of the QTL were known under the Flanking Marker model and when PMax = 80% of the QTL were known under the QTL Per Se model (Table 1). The PMax values were lowest when 100 QTL controlled the trait. Consider a complex trait controlled by 100 QTL and with a low heritability (H = 0.20). When effects of the QTL were estimated with a population size of N = 100, maximum responses to MARS were obtained when only PMax = 30% of the QTL were known under both the Flanking Marker and QTL Per Se models (Table 1). The PMax values increased, however, as the heritability increased from H = 0.20 to H = 0.80 and as the population size increased from N = 100 to N = 400 (Table 1).

When none of the QTL were known (P = 0%), the significance level for detecting QTL that led to the largest response to MARS was usually αMax = 0.20 when 10 QTL controlled the trait and αMax = 0.40 when 100 QTL controlled the trait (Table 1). However, the differences in the response to MARS at α = 0.20, 0.30, or 0.40 were generally small (results not shown). The number of QTL, heritability, and population size affected the ability to detect QTL. Consider a trait controlled by 10 QTL, a high heritability (H = 0.80), and a large population (N = 400) for detecting the unknown QTL. For the QTL Per Se model and a significance level of αMax = 0.20, the average power to detect QTL was 0.98; the average number of significant markers was 31 (out of 100); the average false discovery rate was 0.63 (i.e., about 20 of the 31 significant markers did not correspond to a QTL); and the average VM/VG ratio was 0.98. On the other hand, consider a complex trait controlled by 100 QTL, a low heritability (H = 0.20), and a small population (N = 100) for detecting the unknown QTL. For the Flanking Marker model and a significance level of αMax = 0.30, the average power was 0.40; the average number of significant markers was 37; the average false discovery rate was 0.15; and the average VM/VG ratio was 2.00.

Known QTL versus Ad Hoc Significance Tests to Detect QTL
The response obtained when some or all of the QTL were known (P > 0%) was not always greater than the response obtained when none of the QTL were known (P = 0%). Consider a trait controlled by 10 QTL, a heritability of H = 0.80, and a population size of N = 100. For the Flanking Marker model, the response to MARS for P > 0% was greater than the response for P = 0% only when the percentage of known QTL ranged from P = 70 to 100% (i.e., RP > RαMax = 70–100 in Table 1). The RP > RαMax values in Table 1 indicated the bounds for which having known QTL was advantageous over having unknown QTL in MARS. Outside such bounds, the response to MARS was greater when QTL were unknown and, subsequently, QTL were detected using the appropriate αMax significance level. When 10 QTL controlled the trait, the lower bound of the RP > RαMax values, for both the Flanking Marker and QTL Per Se models, was smallest when the population size and heritability were both low (N = 100 and H = 0.20, Table 1). In this case, the lower bound was P = 40% (i.e., RP > RαMax = 40–100), so that at least four out of the 10 QTL had to be known for prior knowledge of QTL locations to be useful. These four QTL accounted for 66% of the genetic variation for the trait (see Materials and Methods). When 40 or 100 QTL controlled the trait, the smallest lower bound of the RP > RαMax values was P = 30% (Table 1). For P = 30%, the known QTL accounted for about 55% of the genetic variation for the trait. The lower bounds increased as both the population size and heritability increased regardless of the number of QTL controlling a trait.

The largest upper bound of RP > RαMax was always P = 100% when 10 or 40 QTL controlled the trait. But when 100 QTL controlled the trait, the upper bound of RP > RαMax was often less than P = 100%, particularly when QTL effects were estimated from smaller populations (N = 100 or 200) and heritability was low (H = 0.20; Table 1). Consider a complex trait controlled by 100 QTL and with a low heritability (H = 0.20). When effects of the QTL were estimated with a population size of N = 100, having known QTL under the Flanking Marker model was advantageous only within the bounds of RP > RαMax = 30–40 (Table 1). When less than P = 30% or more than P = 40% of the QTL were known, it was more advantageous (in terms of the response to MARS) to completely ignore the prior information on all known QTL and detect the QTL by ad hoc significance tests at αMax = 0.30. When all 100 QTL were known (P = 100%), MARS led to a negative response after cycle 1 under the QTL Per Se model (Fig. 1b). Under the Flanking Marker model, the responses to MARS after cycle 1 were small across a wide range of percentages of known QTL (Fig. 1c).

The advantage of having known QTL in MARS was quantified by the difference between the selection response at PMax (for P > 0%; solid circles in Fig. 2) and the selection response at αMax (for P = 0%; open circles in Fig. 2). Consider a simple trait controlled by few QTL (10) and with a high heritability (H = 0.80); a trait controlled by 40 QTL and with a heritability of H = 0.50; and a complex trait controlled by many QTL (100) and with a low heritability (H = 0.20). For population sizes typically used in a maize breeding program (N = 100), such advantage was greatest when a trait with a heritability of H = 0.50 was controlled by 40 QTL under the QTL Per Se model (Fig. 2). In this situation, the responses (in σG) at cycle 4 in MARS were 3.86 at PMax = 80% and 2.64 at αMax = 0.40. Knowing the exact locations of 32 out of the 40 QTL therefore led to an added response of 3.86 – 2.64 = 1.22, or (3.86 – 2.64)/2.64 = 46%. These gains were substantially higher than those for the Flanking Marker model. Knowing the flanking markers for 24 out of the 40 QTL (PMax = 60%, N = 100, H = 0.50; Table 1) led to an added response of only 0.37 σG or 17% (Fig. 2).
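The added-response arithmetic used here and in the following paragraphs can be written as a small helper. The function name is hypothetical; the inputs are the cycle 4 responses quoted in the text.

```python
def added_response(response_known, response_unknown):
    """Return (absolute gain, percent gain) of known over unknown QTL."""
    gain = response_known - response_unknown
    percent = 100.0 * gain / response_unknown
    return gain, percent


# 40 QTL, H = 0.50, N = 100, QTL Per Se model (values from the text).
gain, percent = added_response(3.86, 2.64)
print(round(gain, 2), round(percent))
```

The percent gain is expressed relative to the response obtained when QTL were unknown, so the same absolute gain counts for more when the baseline response is small.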


Fig. 2. Response (in number of genetic standard deviations) to phenotypic selection (△), MARS when the percentage of known QTL was PMax (•), and MARS when QTL were unknown and were detected by tests at the αMax significance level (○). Effects of the QTL were estimated with a population size of N = 100.

 
The advantage of having known QTL (among the situations depicted in Fig. 2) was smallest for a trait controlled by 100 QTL and with a heritability of H = 0.20. In this situation, the selection responses under the Flanking Marker model were 1.07 at PMax = 30% and 1.00 at αMax = 0.30 (Fig. 2). Knowing the flanking markers for 30 out of the 100 QTL therefore led to an added response of only 1.07 – 1.00 = 0.07, or (1.07 – 1.00)/1.00 = 7%. Increasing the population size to N = 400 failed to increase the advantage of having known QTL. In this situation, the selection responses were 2.52 at PMax = 50% and 2.38 at αMax = 0.40; the added response was 2.52 – 2.38 = 0.14, or (2.52 – 2.38)/2.38 = 6%.

Phenotypic Selection versus MARS under the Flanking Marker and QTL Per Se Models
In maize, four cycles of phenotypic selection would require 8 yr whereas MARS would require 3 yr. If the amount of time required is considered, MARS (requiring 3 yr) should be compared to one or two cycles of phenotypic selection (requiring 2 to 4 yr) rather than to four cycles of phenotypic selection. Nevertheless, per-cycle comparisons indicated that the maximum response to MARS (at cycle 4) under the QTL Per Se model was usually greater than the response to phenotypic selection (at cycle 4). Specifically, MARS under the QTL Per Se model was superior to phenotypic selection in 18 out of the 27 combinations of number of QTL, heritability, and population size (MARS > PS in Table 1). This frequency of MARS > PS was greater than that for the Flanking Marker model, where MARS was superior to phenotypic selection in only 4 out of the 27 combinations. Phenotypic selection became superior to MARS as the number of QTL controlling the trait increased and as the population size decreased. Consider a trait controlled by 100 QTL, a heritability of H = 0.20, and a population size of N = 100. The response at cycle 4 to MARS assuming PMax known QTL (solid circles in Fig. 2) was only about 50 to 60% of the response to phenotypic selection (triangles in Fig. 2) under the Flanking Marker and QTL Per Se models.

For the three combinations of number of QTL and heritability in Fig. 2 (N = 100), the advantage of having markers for the QTL themselves over having markers linked to the QTL was greatest when 40 QTL controlled the trait (H = 0.50). In this situation, the responses at cycle 4 were 3.86 at PMax = 80% for the QTL Per Se model versus 2.57 at PMax = 60% for the Flanking Marker model (Fig. 2). The added response due to having markers for the QTL themselves was therefore 3.86 – 2.57 = 1.29, or (3.86 – 2.57)/2.57 = 50%. Underlying this 50% added response were higher frequencies of favorable alleles (at the known QTL) under the QTL Per Se model (average allele frequency of 0.76, PMax = 70%) than under the Flanking Marker model (average allele frequency of 0.69, PMax = 60%). The added response in terms of unit gain was smallest (0.32 σG) when 100 QTL controlled the trait (H = 0.20). The added response in terms of percentage of gain was smallest (20%) when 10 QTL controlled the trait (H = 0.80).


    DISCUSSION
Genomics and post-genomics sciences will continue to increase our knowledge of the genes controlling quantitative traits. In this study, we attempted to answer two questions related to exploiting this increased knowledge in recurrent selection. First, is the response to MARS larger when QTL are known a priori than when QTL are unknown? Second, is the response to MARS larger when QTL themselves are known through markers (QTL Per Se model) than when QTL are known through linked markers (Flanking Marker model)?

Genetic Model
To do so, we conducted simulations of MARS under genetic models that aimed to integrate current knowledge of the genetics of quantitative traits. These models were consistent with the empirical evidence of a few QTL with large effects and many QTL with small effects for complex traits (Kearsey and Farquhar, 1998; Bernardo, 2002, p. 310). The models included linkage and comprised different combinations of number of QTL and level of entry-mean heritability, which accounted for both within-environment error and genotype × environment interaction. The metabolic flux model we used is well known for modeling physiological epistasis (Cheverud and Routman, 1995) for a quantitative trait (Kacser and Burns, 1981; Keightley, 1989; Bost et al., 1999). Empirical data on gluconeogenesis in rat (Rattus sp.) liver cells (Groen et al., 1986), succinate oxidation in cucumber (Cucumis sativus L.) cotyledon mitochondria (Hill et al., 1993), and the TCA cycle in Dictyostelium discoideum (Albe and Wright, 1992) have been found consistent with metabolic control theory (Bost et al., 1999). We simulated a linear metabolic pathway only, although the general properties for a linear pathway can be extended to branched pathways (Kacser and Burns, 1981). Despite the presence of physiological epistasis, the genetic variation was mainly additive in the models we considered, with additive variance accounting for about 98% of VG. This result was comparable to those of Bost et al. (1999), who found that additive variance for a metabolic flux accounted for >95% of VG. Having largely additive variance even in the presence of physiological epistasis is expected when many loci control a trait (Crow, 1999), and summaries of empirical studies have shown that genetic variance for quantitative traits is primarily additive in self-pollinated crops (Moll and Stuber, 1974). We considered maize, which is a cross-pollinated crop for which estimates of epistatic variance are largely insignificant but for which dominance variance is significant (Moll and Stuber, 1974; Hallauer and Miranda, 1988, p. 119; Lamkey and Edwards, 1999). Dominance does not affect testcross genotypic values, which we simulated (Hallauer and Miranda, 1988, p. 24; Bernardo, 2002, p. 79). In summary, although we did not simulate any known pathway for the expression of a quantitative trait, the genetic models we considered encompassed key factors (i.e., number of QTL, size of gene effects, linkage, heritability, and epistasis) that underlie the genetic control of complex traits.

Benefits and Possible Pitfalls of Having Known QTL in Recurrent Selection
We found that the response to MARS was larger when QTL are known a priori than when QTL are unknown, provided that the available information on known QTL is used judiciously. Our results indicated that prior knowledge of QTL locations is consistently useful in MARS when few QTL (e.g., 10) control the trait. When 10 QTL controlled the trait, the response to MARS was largest when all QTL were known, and the response decreased when fewer QTL were known. This result agreed with the preliminary findings of Charcosset and Moreau (2004). At least 40% (for N = 100 and H = 0.20) of the QTL (those with the largest effects) had to be known for the selection response to be greater than that when unknown QTL are detected by ad hoc significance tests. A higher percentage of the QTL need to be known when heritability and the population size increase, due to a higher power of QTL detection when QTL were unknown a priori.

When 40 or 100 QTL controlled the trait, at least 30% of the QTL (those with the largest effects) had to be known for prior information on QTL locations to be useful. Some of the gains were large, e.g., 1.22σG or 46% when 32 out of the 40 QTL were known, heritability was H = 0.50, and QTL effects were estimated with a population size (N = 100) typically used in MARS (Johnson and Mumm, 1996; Johnson, 2001). However, contrary to when only 10 QTL controlled the trait, selection efficiency did not necessarily increase as the number of known QTL included in the model increased. When 100 QTL controlled the trait, heritability was low (H = 0.20), and the population size was N = 100, knowing all the QTL reduced the response to MARS. Knowing all the QTL was detrimental to the point that it was more advantageous to completely ignore all prior information on QTL locations and detect QTL by ad hoc significance tests.

The problem, as Bernardo (2001) and Johnson (2004) pointed out, is statistical: it is difficult to get good estimates of 100 QTL effects with a population size of N = 100. Consistently, the percentage of known QTL that maximized the response to MARS (PMax) was less than 100%. In other words, it was often advantageous to exploit only the QTL with the largest effects (i.e., major QTL) and ignore those with small effects (i.e., minor QTL), even if the locations of the minor QTL were known. Exploiting only the major QTL also reduces the cost of marker genotyping. The PMax values were consistently highest when the population size was N = 400. This result indicated that minor QTL can be exploited to a greater extent in MARS if the population size is increased. Breeders practice selection in several F2 or backcross populations at a time (Hallauer, 1990). If field resources and marker-genotyping resources are kept constant, an increase in the size of each population would require a decrease in the total number of F2 or backcross populations. Maize breeders generally prefer to select in several F2 or backcross populations, each with relatively few families, rather than in only a few F2 or backcross populations, each with many families (Hallauer, 1990; Bernardo, 2002, p. 13). Breeders need to overcome this preference and use larger population sizes for MARS to be most effective. If fewer populations, each with many families, are used in MARS, then the choice of parents of the F2 or backcross populations becomes crucial. In practice, this implies that MARS would be used mainly for crosses among elite, proven inbreds rather than for speculative crosses.
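The arithmetic behind this trade-off can be sketched as follows. The sketch assumes that one regression coefficient is fitted per known QTL in an ordinary multiple regression with an intercept, and that a breeder has a fixed total number of families to allocate. Both function names and the budget of 2000 families are hypothetical, and the estimation procedure in the simulations may differ in detail.

```python
def residual_df(pop_size, n_effects):
    """Residual degrees of freedom after fitting an intercept and n_effects coefficients."""
    return pop_size - 1 - n_effects


def populations_affordable(total_families, pop_size):
    """Number of populations of a given size that fit within a fixed family budget."""
    return total_families // pop_size


TOTAL_FAMILIES = 2000  # hypothetical fixed field and genotyping budget

for pop_size in (100, 400):
    n_pops = populations_affordable(TOTAL_FAMILIES, pop_size)
    for n_known_qtl in (30, 100):
        df = residual_df(pop_size, n_known_qtl)
        # Zero or negative residual df means the effects cannot all be
        # estimated separately by ordinary least squares.
        status = "estimable" if df > 0 else "not estimable by ordinary least squares"
        print(f"N = {pop_size}, populations = {n_pops}, "
              f"known QTL fitted = {n_known_qtl}, residual df = {df} ({status})")
```

Under these assumptions, fitting all 100 QTL with N = 100 leaves no residual degrees of freedom, whereas fitting only the 30 largest QTL, or increasing N to 400, leaves ample information for estimation at the cost of fewer populations.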

The objective in MARS is to explain as much of the genetic variance as possible with a set of markers, while keeping within a reasonable level the noise due to false positives (i.e., when QTL are unknown) and/or poor estimation of QTL effects. This difference in primary objectives relative to detecting individual QTL explains why the conditions that maximize the efficiency of MARS when QTL are unknown a priori (i.e., relaxed significance levels of 0.20 to 0.40; Hospital et al., 1997) are different from those for mapping QTL (i.e., stringent significance levels). Having known QTL obviously eliminates the need to detect QTL in MARS and limits the statistical issue to the estimation of QTL effects. Here we simulated QTL that differed in the size of their effects and, in the Flanking Marker model, markers tightly linked to a minor QTL could have explained as much variation as markers loosely linked to a major QTL. Our results show that this uncertainty in estimating the effects of QTL has limited detriment in MARS when compared with the benefits of including the right markers in the model.

As mentioned in the Introduction, Bernardo (2001) investigated the usefulness of having known QTL in the context of (i) predicting the performance of untested single crosses and (ii) marker-based selection among recombinant inbreds (without MARS) within an F2 population. He found that QTL effects estimated from the performance of single crosses among many inbreds were useful for identifying the best recombinant inbreds derived from an F2 population between two inbreds. Specifically, when the heritability among single crosses was H = 0.20, exploiting only the 30 QTL with the largest effects (out of 100) led to the highest efficiency relative to phenotypic selection among recombinant inbreds. Our current results therefore complement those of Bernardo (2001), as we have shown that having known QTL is useful also for improving an F2 population before selfing.

Benefits from Having Markers for the QTL Themselves
Our results showed that knowing markers that flank the QTL (Flanking Marker model) increases the response to MARS, and knowing markers for the QTL themselves (QTL Per Se model) increases the response further. We found that knowing markers for the QTL themselves was most beneficial for traits controlled by a moderately large number of QTL (e.g., 40). For a trait controlled by 40 QTL, knowing the QTL through flanking markers led to an added response of 0.37σG or 17% (PMax = 60%, H = 0.50, and N = 100). Having markers for the QTL themselves increased these gains to 1.29σG or 50% (PMax = 80%, H = 0.50, and N = 100). We observed clear and consistent increases in the frequency of favorable alleles. Recombination between a QTL and linked markers has been cited as a potential problem in MARS (Hospital et al., 1997). Knowledge from genomics and post-genomics sciences of the actual QTL will eliminate the loss, through recombination, of QTL information during the later cycles in MARS.
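The loss of marker information through recombination can be illustrated with the textbook result that, under random mating, the association between two loci declines by a factor of (1 - r) each generation, where r is the recombination frequency. The sketch below converts map distance to r with the Haldane mapping function. It ignores selection and finite population size, and the map distances are hypothetical, so it is only a rough guide to what happens over cycles of MARS.

```python
import math


def haldane_recombination(distance_cm):
    """Recombination frequency for a map distance in cM (Haldane mapping function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def association_remaining(recomb_freq, generations):
    """Fraction of the initial marker-QTL association left after random mating."""
    return (1.0 - recomb_freq) ** generations


# Hypothetical marker-QTL distances; 0 cM stands for a marker for the QTL itself.
for distance_cm in (0.0, 5.0, 10.0):
    r = haldane_recombination(distance_cm)
    kept = [round(association_remaining(r, t), 3) for t in (1, 2, 3)]
    print(f"distance = {distance_cm:4.1f} cM, r = {r:.4f}, "
          f"association left after 1-3 generations: {kept}")
```

A marker for the QTL itself (0 cM) retains the full association in every generation, whereas a linked marker loses part of it with each round of recombination.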

When few QTL (e.g., 10) control the trait, knowing the QTL through linked markers (Flanking Marker model) rapidly leads to a high frequency of favorable alleles in MARS. This leaves little room for further increases in the frequency of favorable alleles through the use of markers for the QTL themselves (QTL Per Se model). On the other hand, estimating the effects of many QTL (e.g., 100) with population sizes typically used in maize (N = 100) is problematic, not only because of the number of QTL whose effects are estimated but also because of the increased probability of having linked QTL as the total number of QTL increases. Even if markers are available for the QTL themselves, linkage causes the effect of a known QTL to be difficult to separate from the effects of other linked QTL.

Finally, we speculate that a combination of approaches for exploiting known QTL would be necessary. With continuing advances in genomics and post-genomics sciences, some QTL might be known through markers for the QTL themselves, some QTL might be known through linked markers, and some QTL might remain unknown. Known QTL could be directly accounted for in MARS while unknown QTL could be detected and accounted for by ad hoc significance tests using relaxed significance levels. Studies on jointly exploiting both the known QTL and unknown QTL are needed.


    ACKNOWLEDGMENTS
 
This research was conducted while Rex Bernardo was on a most pleasant sabbatical leave at the Moulon station in 2004. We thank the Institut National de la Recherche Agronomique for a grant that allowed this sabbatical leave.

Received for publication May 26, 2005.


    REFERENCES
 





