Published in Crop Sci. 43:1764-1773 (2003).
© 2003 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
CROP BREEDING, GENETICS & CYTOLOGY
Comparison of Two Breeding Strategies by Computer Simulation
Jiankang Wang (a)*, Maarten van Ginkel (a), Dean Podlich (b), Guoyou Ye (c), Richard Trethowan (a), Wolfgang Pfeiffer (a), Ian H. DeLacy (c), Mark Cooper (b), and Sanjaya Rajaram (a)
(a) Wheat Program, CIMMYT, Apdo. Postal 6-641, 06600 Mexico, D.F., Mexico
(b) Pioneer Hi-Bred International Inc., 7300 N.W. 62nd Avenue, PO Box 1004, Johnston, IA 50131, USA
(c) School of Land and Food Sciences, The University of Queensland, Brisbane, Qld 4072, Australia
* Corresponding author (jkwang{at}cgiar.org).
ABSTRACT
Breeding strategies used by plant breeders are many and varied, making it difficult to compare efficiencies of different breeding strategies through field experimentation. The objective of this paper was to compare, through computer simulation, two widely used breeding strategies, the modified pedigree/bulk selection method (MODPED) and the selected bulk selection method (SELBLK), in CIMMYT's wheat breeding program. The genetic models developed accounted for epistasis, pleiotropy, and genotype x environment (GE) interaction. The simulation experiment comprised the same 1000 crosses, developed from 200 parents, for both breeding strategies. A total of 258 advanced lines remained following 10 generations of selection. The two strategies were each applied 500 times on 12 GE systems. Findings indicated that genetic gain from SELBLK was on average 3.9% higher than that from MODPED, and genetic gain adjusted by target genotypes from SELBLK was on average 3.3% higher than MODPED for a wide range of genetic models. A greater proportion of crosses were retained (25% more) by means of SELBLK compared with MODPED, and from F1 to F8, SELBLK required one third less land than MODPED and produced fewer families (40% of the number for MODPED). For the genetic models considered in our study, computer simulations showed that the SELBLK method resulted in slightly greater genetic gain and significant improvements in cost effectiveness.
Abbreviations: B, CIMMYT's breeding location at El Batan, Mexico; CIMMYT, Centro Internacional de Mejoramiento de Maiz y Trigo (International Maize and Wheat Improvement Center); GE, genotype x environment; LR, leaf rust; ME, megaenvironment; ME1, the low rainfall and irrigated environment type for spring wheat; MODPED, modified pedigree/bulk selection method; QUCIM, a QU-GENE application module for breeding simulation; QU-GENE, a simulation platform for quantitative analysis of genetic models developed by The University of Queensland, Australia; QUGENE, the engine of QU-GENE; SELBLK, selected bulk selection method; SP, small plot; T, CIMMYT's breeding location at Toluca, Mexico; TG, target genotype; TPE, target population of environments; YR, yellow rust; YT, yield trial.
INTRODUCTION
THE GLOBAL IMPACT of the wheat breeding program of the International Maize and Wheat Improvement Center (CIMMYT) has been significant and well documented (Rajaram, 1999). Many factors have contributed to CIMMYT's success, such as breeding targeted to megaenvironments (MEs), use of a diverse gene pool for crossing, and shuttle breeding (Rajaram et al., 1994; Rajaram, 1999). Another key factor, however, has been the breeding strategies adopted by CIMMYT breeders. A breeding strategy is defined as all crossing, seed propagation, and selection activities in an entire breeding cycle. A breeding cycle begins with crossing and ends at the generation when the selected advanced lines are returned to the crossing block as new parents.
The strategies used by CIMMYT breeders have evolved with time. Pedigree selection was used primarily from 1944 until 1985. From 1985 until the second half of the 1990s, the main selection method was a modified pedigree/bulk method (MODPED) (van Ginkel et al., 2002), which successfully produced many of the widely adapted wheats now being grown in the developing world. This method was replaced in the late 1990s by the selected bulk method (SELBLK) (van Ginkel et al., 2002) in an attempt to improve resource-use efficiency. The major differences between MODPED and SELBLK are outlined below.
The MODPED method begins with pedigree selection of individual plants in the F2 followed by three bulk selections from F3 to F5, and pedigree selection in the F6, hence the name modified pedigree/bulk. In the SELBLK method, spikes of selected F2 plants within one cross are harvested in bulk and threshed together, resulting in one F3 seed lot per cross. This selected bulk selection is also used from F3 to F5, while pedigree selection is used only in the F6. A major advantage of SELBLK compared with MODPED is that fewer seed lots need to be harvested, threshed, and visually selected for seed appearance. In addition, significant savings in time, labor, and costs associated with nursery preparation, planting and plot labeling ensue, and potential sources of error are avoided (van Ginkel et al., 2002). Although some small-scale field experiments have been conducted comparing the efficiencies of these breeding strategies (Singh et al., 1998), the efficiency of SELBLK compared with that of MODPED remains untested on a larger scale.
Quantitative genetics provides much of the framework for the design and analysis of selection methods used within breeding programs (Allard, 1960; Falconer and Mackay, 1996; Cooper et al., 1999). However, there are usually associated assumptions, some of which can be easily tested or satisfied by experimentation; others can seldom, if ever, be met. Computer simulation provides us with a tool to investigate the implications of relaxing some of the assumptions and the effect this has on the conduct of a breeding program. QU-GENE, a simulation platform for quantitative analysis of genetic models, was developed for this purpose (Podlich and Cooper, 1998). It has been used to compare efficiencies of different breeding strategies (Cooper et al., 2002) and modifications to existing selection strategies (Podlich et al., 1999), and to conduct a power analysis of the joint segregation analysis of the mixed inheritance model for quantitative traits (Wang et al., 2001).
The QU-GENE simulation platform consists of a two-stage architecture (http://pig.ag.uq.edu.au/qu-gene; verified 10 April 2003). The first stage is the engine (referred to as QUGENE), and its role is to: (i) define the GE system (i.e., all the genetic and environmental information of the simulation experiment), and (ii) generate the starting population of individuals (base germplasm). The second stage encompasses the application modules, whose role it is to investigate, analyze, or manipulate the starting population of individuals within the GE system defined by the engine. The application module will usually represent the operation of a breeding program (Podlich and Cooper, 1998). A QU-GENE strategic application module, QUCIM, has therefore been developed to simulate CIMMYT's wheat breeding procedure, with the aim of understanding why CIMMYT's wheat breeding effort has been so successful and finding ways to improve its efficiency further.
The objective of our research was to conduct a simulation experiment in which the engine QUGENE and the application module QUCIM were used to compare CIMMYT's MODPED and SELBLK methods in terms of genetic gain, number of crosses retained after one breeding cycle, and resource allocation.
MATERIALS AND METHODS
Two programs (QUGENE and QUCIM) and two input files (one for the QU-GENE engine and one for the QUCIM module) are required to run the simulation experiment. The first input file contains all the information needed to define a GE system and the population of genotypes to which the breeding strategies will be applied. The second file contains all the crossing and selection information required to define the breeding strategies. The genetic and environmental information used to construct these files and the criteria used to compare breeding strategies are described below.
Genotype x Environment System
The GE system underlies the genetic and environmental model framework for simulation experiments. Information about a GE system includes the target population of environments (TPE) for the breeding program, breeding traits and their associated phenotypic errors, genes and their degree of linkage, and genes and their effects on phenotype in different environment types. The TPE consists of a set of different environment types, each with a frequency of occurrence. Each environment type has its own gene action and gene interaction, which provides the framework for defining GE interactions. A specific GE system that fits CIMMYT's germplasm and breeding objectives is required to simulate CIMMYT's wheat breeding program. The breeding program targeted to megaenvironment 1 (ME1) (low rainfall and irrigated environments for spring wheat; Rajaram et al., 1994) will be the primary focus of this paper. While not all the details of the GE system are available at this stage, reasonable approximations of the critical features can be made.
There are three key Mexican locations involved in CIMMYT's wheat breeding effort targeted to ME1: Cd. Obregon [27° N, 39 m above sea level (masl)], Toluca (19° N, 2640 masl), and El Batan (19° N, 2300 masl) (Fig. 1). Cd. Obregon is an arid, irrigated location whose growing season conditions are similar to those of many irrigated environments around the world. Yield trials for materials targeted to ME1 are conducted only at Cd. Obregon. Precipitation in Toluca is high (about 800 mm during the summer crop cycle), providing conditions favorable for foliar diseases, including the rusts and foliar blights. Precipitation at El Batan is more erratic, with an annual average of 500 to 600 mm; irrigation facilities are available when needed. El Batan is used mainly for leaf rust screening and small-scale seed increases. The TPE for the breeding program targeting ME1 consists of the Cd. Obregon environment type at a frequency of 1.0 and the Toluca and El Batan types, both at a frequency of 0.0.

Fig. 1. Germplasm flow for single crosses made in Toluca and targeted to ME1 (low rainfall and irrigated environment for spring wheat). MODPED, modified pedigree/bulk selection method. SELBLK, selected bulk selection method.
Ten major traits are considered for among-family and within-family selection: grain yield, lodging, stem rust, leaf rust, stripe rust, height, tillering, days to heading, grains per spike, and 1000-kernel weight. The estimated numbers of genes controlling these traits are given in Table 1. Only those genes that would segregate in crosses between the defined parental stocks are considered. For yield, we have little knowledge about the number of genes and their effects on phenotype. Hence two levels of gene number are considered: 20 and 40. Their effects were generated by an ensemble approach (Kauffman, 1993) that samples the effects of the genes from a specified statistical distribution (Cooper and Podlich, 2003). Here the uniform distribution from 0 to 1 is used because there is no information available on the distribution of effects. The yield gene effects are assigned as QUCIM is running. Three levels of epistasis among yield genes are considered: no epistasis, digenic epistasis, and trigenic epistasis. The effects of genes on other traits are assumed to be fixed and additive (Table 1). Dominance is less important in breeding for self-pollinated crops (van Oeveren and Stam, 1992) and was not considered in this study.
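The ensemble approach for yield genes can be illustrated with a minimal Python sketch. It is not QUCIM's code: the function names, the seed, and the purely additive gene action (the no-epistasis case, with 0, 1, or 2 favourable alleles per gene) are assumptions made here for illustration only.

```python
import random

def sample_gene_effects(n_genes, rng):
    """Draw one effect per yield gene from the uniform distribution on [0, 1)."""
    return [rng.random() for _ in range(n_genes)]

def additive_value(genotype, effects):
    """Genotypic value under a purely additive model (an assumption of this sketch).

    genotype[k] is the number of favourable alleles (0, 1 or 2) at gene k.
    """
    return sum(count * effect for count, effect in zip(genotype, effects))

rng = random.Random(2003)  # arbitrary fixed seed, so each "model" is reproducible
# One "model" = one random draw of all yield gene effects (here 50 models, 20 genes).
models = [sample_gene_effects(20, rng) for _ in range(50)]

effects = models[0]
best = additive_value([2] * 20, effects)   # favourable allele fixed at every gene
worst = additive_value([0] * 20, effects)  # unfavourable allele fixed at every gene
```

Each entry of `models` plays the role of one set of randomly assigned yield gene effects; epistatic models would need an additional interaction term that this sketch does not attempt.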
Pleiotropic gene effects are assumed to cause the correlation between two traits. Linkage can also give rise to a correlation between traits, but is not considered because there is little linkage information available. As an example, the correlation between yield and lodging is estimated at -0.5 by CIMMYT breeders (Table 2). This negative correlation assumes that all three lodging genes have some negative effects on yield. The three yield components (tillering, grains per spike, and 1000-kernel weight) are negatively correlated with each other to a degree; however, they are all positively correlated with yield. We can easily build a GE system with negative correlations among the three yield components, but allowing the GE system to have a positive correlation between yield and each of the three yield components at the same time is not as simple. Therefore, in designing the GE system, only the negative correlations among the three yield components are considered, not their positive correlations with yield (Table 2). In practice, trait correlations change following selection and depend on the reference population used.
Table 2. Correlation coefficient matrix among traits at Cd. Obregon estimated by CIMMYT's breeders (upper triangle) and genetic correlation coefficient matrix estimated from one simulated GE system (lower triangle) with 20 yield genes and digenic epistasis, and one population with all gene frequencies at 0.50 and a population size of 200.
Breeding Strategy
In CIMMYT's wheat breeding program, the best advanced lines developed from the F10 generation will be returned to the crossing block to be used for new crosses, so a new breeding cycle starts after F10 leaf rust screening at El Batan (Fig. 1). The number of generations in one breeding cycle is 10 for both breeding strategies. There may be more than one round of selection for some generations, such as the F7 generation and the small plot evaluation in the F8 generation (F8(SP)) (Fig. 1; Table 3). The F7 is taken as an example. Once an advanced line is selected from among F7 head-rows, the seed lot is split three ways. A reserve is kept for sowing yield trials at Cd. Obregon the following winter cycle (F8(YT)); of the other two sets, one is sown at Toluca (F8(T)) and another at El Batan (F8(B)) during the summer for disease evaluation in the field. The composition of the yield trial in the F8 generation (F8(YT)) at Cd. Obregon is determined on the basis of disease reactions at the two summer locations [F8(T) and F8(B)]. So, in fact, the F7 generation is subjected to four rounds of selection: one among F7 lines, two based on field tests at both Toluca and El Batan, and one based on F8 yield performance at Cd. Obregon. Since the seed of the two F8 field tests and the F8 yield trial are derived from the selected F7 lines, the indicator of the seed source is 0 for the F7 (Table 3). For those generations subjected to just one round of selection, no indicator for the seed source is required.
Among-family selection and within-family selection are distinct for each generation in a breeding strategy. For the F1 or F2, each family is derived from one cross. One family in the F3 is also derived from a distinct cross if bulk selection is used in the F2, but from one individual plant if pedigree selection is used. The traits for both among-family and within-family selections can be the same or different, as is the case for selected proportions (Table 4).
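The way the F2 selection method determines the number of F3 families can be sketched as follows. The function name and the example numbers are hypothetical; the actual selected proportions are those defined in the breeding strategy file and are not reproduced here.

```python
def f3_family_count(crosses_retained, plants_selected_per_cross, f2_method):
    """Number of F3 families that result from F2 selection.

    Bulk selection pools the selected plants of a cross into one seed lot,
    so each retained cross gives one F3 family. Pedigree selection keeps
    every selected plant separate, so each selected plant gives one family.
    """
    if f2_method == "bulk":
        return crosses_retained
    if f2_method == "pedigree":
        return crosses_retained * plants_selected_per_cross
    raise ValueError(f"unknown F2 selection method: {f2_method!r}")

# Hypothetical numbers, chosen only to show the contrast between methods.
bulk_families = f3_family_count(100, 5, "bulk")          # 100
pedigree_families = f3_family_count(100, 5, "pedigree")  # 500
```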
Table 4. Among-family and within-family selected proportions and selection methods for each trait in the F1 to F3 generations.
Phenotypic Value of a Genotype and Family Mean of a Family
For the purposes of simulation, the genotypic value of a genotype can be calculated from the definition of gene actions in the GE system. However, breeders select using the phenotypic value. Therefore, the phenotypic value of a genotype in a specific environment needs to be defined from its genotypic value and some associated environmental errors. For example, if we have n plots (or replications) for a family and the plot size is m, there will be n x m individual plants (or genotypes) for this family. The genotypic value $g_{ij}$ (i = 1,...,n; j = 1,...,m) can be defined from the GE system and the phenotypic value $p_{ij}$ can then be calculated from the formula $p_{ij} = g_{ij} + e_{bi} + e_{wij}$, where $e_{bi}$ is the between-plot error for plot i and $e_{wij}$ is the within-plot error for genotype j in plot i, and both $e_{bi}$ and $e_{wij}$ are assumed to be normally distributed. The variance ($\sigma_e^2$) of $e_{wij}$ is calculated from the definition of heritability in the broad sense, $h_b^2 = \sigma_g^2/(\sigma_g^2 + \sigma_e^2)$ (Table 1), where the genetic variance ($\sigma_g^2$) is calculated from the genotypic values of individuals in the initial population. Once the error variance is determined, it will be used for all generations without change. The genetic variance changes from generation to generation, so the heritability may be different in different generations. In this simulation experiment, the variance of $e_{bi}$ is set to be half of $\sigma_e^2$. So once the genotypic value of a genotype has been defined, a random effect for between-plot error from the distribution $N(0, 0.5\sigma_e^2)$ and a random effect for within-plot error from the distribution $N(0, \sigma_e^2)$ will be added to the genotypic value $g_{ij}$ to give the phenotypic value $p_{ij}$. The family mean can also be calculated from $p_{ij}$. QUCIM then simulates within-family selection from phenotypic values and among-family selection from family means. For multiple traits, independent selection will be used for both within-family and among-family selections.
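The phenotype model just described can be sketched in a few lines of Python. This is an illustration rather than QUCIM's implementation: the function names are hypothetical, and the sketch assumes that broad-sense heritability is defined as the genetic variance divided by the sum of the genetic variance and the within-plot error variance.

```python
import random
from statistics import mean

def within_plot_error_variance(sigma_g2, h2_broad):
    """Solve h2 = sigma_g2 / (sigma_g2 + sigma_e2) for sigma_e2 (assumed definition)."""
    return sigma_g2 * (1.0 - h2_broad) / h2_broad

def simulate_family(genotypic_values, sigma_e2, rng):
    """Return phenotypic values p_ij = g_ij + e_b_i + e_w_ij.

    genotypic_values[i][j] is the genotypic value of plant j in plot i.
    The between-plot error has variance 0.5 * sigma_e2, as in the text.
    """
    sd_between = (0.5 * sigma_e2) ** 0.5
    sd_within = sigma_e2 ** 0.5
    phenotypes = []
    for plot in genotypic_values:
        e_b = rng.gauss(0.0, sd_between)  # one draw shared by every plant in the plot
        phenotypes.append([g + e_b + rng.gauss(0.0, sd_within) for g in plot])
    return phenotypes

def family_mean(phenotypes):
    """Mean phenotypic value over all plants in all plots of a family."""
    return mean(p for plot in phenotypes for p in plot)

# Example: genetic variance 2.0 and heritability 0.5 imply an error variance of 2.0.
sigma_e2 = within_plot_error_variance(2.0, 0.5)
family = simulate_family([[1.0, 2.0], [3.0, 4.0]], sigma_e2, random.Random(1))
```

Within-family selection would then rank the individual values in `family`, while among-family selection would rank the value returned by `family_mean`.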
Experimental Design
A set of three files (one for GE system, one for initial population, and one for breeding strategies) is required to run QUCIM. The first two files are the two output files generated after running QUGENE. The other file defines the breeding strategies to be applied on the GE system and the initial population. Twelve combinations are considered in the experiment.
- GE system: Because of the lack of information available to define a real GE system, different GE systems are used, in which two levels of yield gene number (20 and 40), three levels of epistasis for yield genes (no epistasis, digenic epistasis, and trigenic epistasis), and two levels of pleiotropy (absent and present) will be considered for simulation, giving 12 GE systems.
- Initial population: One initial population comprised of 200 homozygous genotypes (parents) is used, and all gene frequencies in the initial population are set at 0.5.
- Breeding strategies: Both MODPED and SELBLK are defined in one file. To make proper comparisons, the two breeding strategies start from the same population (or germplasm) and finish with a similar number of selected lines after a breeding cycle. The same 1000 crosses are made for both strategies.
- Models, runs, and cycles: The advantage of simulation is that the same breeding strategy can be repeated many times (called runs in this paper) for different genetic models. The different results from runs are a consequence of the stochastic nature of the breeding process. The effects of the yield genes are defined as random effects sampled from the uniform distribution (Kauffman, 1993; Cooper et al., 2002). Fifty models are considered for various yield gene effects and 10 runs for each model. One breeding cycle may be enough to compare two strategies, although QUCIM can run any number of breeding cycles. Therefore, in the 12 sets, QUCIM was run for one breeding cycle for 50 models (50 different yield genetic effects randomly assigned from the uniform distribution) and 10 runs (or replications).
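The factorial layout above can be enumerated with a short Python sketch. The level labels and the ordering of the sets are assumptions made for illustration; the source does not state how the sets are numbered.

```python
from itertools import product

gene_numbers = (20, 40)
epistasis_levels = ("none", "digenic", "trigenic")
pleiotropy_levels = ("absent", "present")

# 2 x 3 x 2 = 12 GE systems (experimental sets).
ge_systems = list(product(gene_numbers, epistasis_levels, pleiotropy_levels))

n_models, n_runs = 50, 10
strategies = ("MODPED", "SELBLK")

# Each strategy is applied n_models * n_runs times within each GE system.
applications_per_strategy_per_set = n_models * n_runs
total_applications = len(ge_systems) * len(strategies) * applications_per_strategy_per_set

for set_number, (genes, epistasis, pleiotropy) in enumerate(ge_systems, start=1):
    print(set_number, genes, epistasis, pleiotropy)
```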
Criteria Used to Compare Efficiencies of Different Breeding Strategies
Genetic gain in yield is the major criterion used to compare different breeding strategies. During simulation, QUCIM records the genotype of each individual in a population. From the genotype and the GE system, QUCIM defines the genotypic values of an individual in the TPE and all environment types in the TPE. In this paper, fitness is used to represent the genotypic value of a genotype or the mean genotypic value of a population in any environment type or the TPE. The difference in fitness before and after a breeding cycle is the genetic gain. However, when breeding strategies are compared under a wide range of GE systems, different scales in different GE systems make it inappropriate to compare genetic gain on the basis of the original scales of the fitness values. We therefore provide a standardized genetic gain: the genetic gain adjusted by target genotypes. Once a GE system and all gene effects in it have been defined, the best target genotype (with the highest fitness among all possible genotypes) and worst target genotype (with the lowest fitness among all possible genotypes) in the GE system can be defined. The fitness adjusted by target genotypes is then used to measure the distance of a genotype or a population from the worst target genotype in the GE system, and the distance from the worst target genotype to the best target genotype is set to 100.00. The genetic gain adjusted by target genotypes can be used to compare the efficiencies from different breeding strategies across a wide range of models differing in scale of genotypic values. Supposing $F$ and $F'$ are the fitness of a population before and after selection, respectively, then the genetic gain ($\Delta G$) is $\Delta G = F' - F$. Assuming that $TG_l$ and $TG_h$ are the genotypic values of the two extreme target genotypes, then the fitness adjusted by target genotypes ($F_{ad}$) is

$$F_{ad} = \frac{F - TG_l}{TG_h - TG_l} \times 100$$

and the genetic gain adjusted by target genotypes ($\Delta G_{ad}$) is

$$\Delta G_{ad} = F'_{ad} - F_{ad} = \frac{\Delta G}{TG_h - TG_l} \times 100.$$
The adjusted genetic gain scales the gain relative to the extreme genotypes possible in the GE system and is particularly useful as a unit measure when different epistasis levels and gene numbers are included.
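A minimal Python sketch of this standardization follows. It assumes, from the verbal definition above, that adjusted fitness is the distance from the worst target genotype expressed on a scale where the best target genotype is 100; the function names and example values are hypothetical.

```python
def adjusted_fitness(fitness, tg_low, tg_high):
    """Fitness rescaled so the worst target genotype is 0 and the best is 100."""
    if tg_high <= tg_low:
        raise ValueError("best target genotype must exceed the worst")
    return 100.0 * (fitness - tg_low) / (tg_high - tg_low)

def adjusted_gain(fitness_before, fitness_after, tg_low, tg_high):
    """Genetic gain adjusted by target genotypes: change in adjusted fitness."""
    return (adjusted_fitness(fitness_after, tg_low, tg_high)
            - adjusted_fitness(fitness_before, tg_low, tg_high))

# Hypothetical GE system: worst genotype 0.0, best genotype 20.0.
# A population moving from fitness 9.0 to 10.0 gains 1.0 on the raw scale,
# which is 5.0 on the adjusted scale.
example_gain = adjusted_gain(9.0, 10.0, 0.0, 20.0)
```

Because the gain is divided by the range between the two target genotypes, the same raw gain counts for less in a GE system with a wider range, which is what makes gains comparable across gene numbers and epistasis levels.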
Genetic gain adjusted by target genotypes will be used in this paper mainly to compare the breeding strategies, but the number of crosses retained after selection and some economic factors are also considered.
RESULTS AND DISCUSSION
Genetic Gain Adjusted by Target Genotypes
QUCIM can compute an estimate of genetic gain for every trait defined in the GE system. In this study, only the results for yield are examined. However, since secondary traits such as rust resistance, days to heading, and height, are all correlated with yield to some degree (Table 2), they are used in the simulation experiments to define a more realistic GE system. Because of the scale effects, the genetic gain adjusted by target genotypes (hereafter abbreviated as adjusted gain) will be used primarily for comparison.
When the 12 sets (one set is one yield gene number x epistasis level x pleiotropy level combination) were considered individually, the adjusted gains from the two breeding strategies were significantly different among models, but generally not among runs, except for set 10 (Table 5). This means a large number of models (normally more than 30) and a smaller number of runs (normally 10) should be used in simulation. This emphasizes the importance of using a wide range of genetic models in any comparison of breeding strategies using computer simulation. Significant (P < 0.05) differences between breeding strategies were noted in some sets (sets 2, 4, 6, 8 and 10), but not all. These were all cases where pleiotropy was present in the genetic model. The different selection pressures that were applied to the traits for the MODPED and SELBLK (Table 4) resulted in a significant difference in the adjusted gain for yield in the presence of the pleiotropic effects of these traits on yield (Table 5). In the absence of pleiotropic effects, there were no significant differences between the breeding strategies. However, for set 5 the breeding strategies are significantly different at P = 0.054; in this case the MODPED had a higher adjusted genetic gain than SELBLK. For those sets where adjusted gains are significantly different, the adjusted gain from SELBLK is always higher than that from MODPED. This means the SELBLK is at least equivalent to or better than MODPED in terms of adjusted gain for the genetic models considered in this study. When all sets are considered together, the adjusted gains were significantly different among or between experiment sets, models, and breeding strategies, but not among runs (Table 6). In the 12 sets, there are two yield gene numbers, three epistasis levels and two pleiotropy levels (Table 5). The adjusted gains are significantly different for all three factors. 
When the nested effect model was considered, significant differences were found between breeding strategies, breeding strategies in models, and model by strategy interactions in experimental sets. The existence of model by strategy interaction indicates that the question of which strategy is better depends on the model used. In most GE systems, SELBLK has higher adjusted gains in more models than MODPED. However, the reverse is true in GE systems with trigenic epistasis but no pleiotropy (Table 7).
Table 5. Genetic gain adjusted by target genotypes across 50 models and 10 runs in each set and the test of difference.
Table 7. Number of models in each GE system where selected bulk selection method has higher genetic gains across the 10 runs.
The average adjusted gain is 5.83 for MODPED and 6.02 for SELBLK, a difference of 3.3% (Table 8; Fig. 2a). This difference is not large and therefore unlikely to be detected in field experiments (Gill et al., 1995; Singh et al., 1998). However, it can be detected through simulation, which indicates that the high level of replication (50 models by 10 runs in this experiment) feasible with simulation can better account for the stochastic properties of a run of a breeding strategy and for the sources of experimental errors. The average adjusted gains for the two yield gene numbers 20 and 40 are 6.83 and 5.02, respectively (Table 8), suggesting that genetic gain decreases with increasing yield gene number. The average adjusted gains were 6.70 for no epistasis, 5.36 for digenic epistasis, and 5.71 for trigenic epistasis (Table 8), which indicates that epistasis will reduce the adjusted gain. The adjusted gain associated with the absence of pleiotropy is also higher than that for the presence of pleiotropy (Table 8). These results show that the increase in gene number and the presence of epistasis and pleiotropy make it more difficult for a breeding strategy to identify the trait performance level of the best genotype in the defined GE system. When the experimental factors are considered individually, the adjusted gain from SELBLK is always significantly higher than that from MODPED, except in the absence of pleiotropy (Table 9), indicating SELBLK is at least equivalent to or better than MODPED.

Fig. 2. Results from the simulation experiment. (a) Adjusted genetic gain after one breeding cycle across all experimental sets. (b) Number of crosses after each generation's selection across all experimental sets. (c) Number of families in each generation in one breeding cycle. (d) Number of individual plants in each generation in one breeding cycle. F8(T), F8 field test at Toluca; F8(B), F8 field test at El Batan; F8(YT), F8 yield trial at Cd. Obregon; F8(SP), F8 small plot evaluation at Cd. Obregon; F9(T), F9 field test at Toluca; F9(B), F9 field test at El Batan; F9(YT), F9 yield trial at Cd. Obregon; F9(SP), F9 small plot evaluation at Cd. Obregon; F10(YR), F10 stripe rust screening at Toluca; F10(LR), F10 leaf rust screening at El Batan.
Table 9. The average genetic gains adjusted by target genotypes of SELBLK and MODPED for each experimental factor level.
Number of Crosses Remaining after Selection
The same 1000 crosses were made for both breeding strategies and 258 advanced lines were selected after a breeding cycle, regardless of the GE system used. The number of crosses remaining after one breeding cycle is significantly different among models and strategies, but not among runs (Table 10). The number of crosses remaining from SELBLK is always higher than that from MODPED, which means that delaying pedigree selection favors diversity. On average, 30 more crosses were maintained in SELBLK (Fig. 2b). However, there is a crossover between the two breeding strategies (Fig. 2b). Before F5, the number of crosses in MODPED is higher than that in SELBLK. The number of crosses becomes smaller in MODPED after F5 when pedigree selection is applied in F6. Among-family selection from F1 to F5 in SELBLK is equal to among-cross selection, and results in a greater reduction in cross number for SELBLK compared with MODPED in the early generations. In general, only a small proportion of crosses remains at the end of a breeding cycle (11.8% for MODPED and 14.8% for SELBLK); therefore, intense among-cross selection in early generations is unlikely to reduce the genetic gain. On the contrary, breeders will tend to concentrate on fewer but "higher probability" crosses (Simmonds, 1996). That just a few crosses of the many generated remain after the final yield trial stage is common in most breeding programs. Since more crosses remain in SELBLK, the population following selection from SELBLK may have larger genetic diversity than that from MODPED. In this context, SELBLK is also superior to MODPED.
Table 10. Average number of crosses retained after each generation's selection across 50 models and 10 runs for each set.
Resource Allocation
Since the number of families and selection methods after F8 are basically the same for both MODPED and SELBLK, only the resources allocated from F1 to F8 are compared. The total number of individual plants from F1 to F8 was calculated to be 5 155 090 for MODPED and 3 358 255 for SELBLK (Fig. 2d). Assuming that planting intensity is similar, SELBLK will use approximately two thirds of the land allocated to MODPED. Furthermore, SELBLK produces a smaller number of families compared with MODPED (Fig. 2c). From F1 to F8, there are 63 188 families for MODPED but only 24 260 for SELBLK, approximately 40% of the number for MODPED. Therefore, when SELBLK is used, fewer seed lots need to be handled at both harvest and sowing, resulting in significant savings in time, labor, and cost.
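The ratios quoted above follow directly from the reported totals, as this short Python check shows. The variable names are chosen here for illustration; the counts are those given in the text.

```python
# Totals from F1 to F8, as reported in the text.
plants = {"MODPED": 5_155_090, "SELBLK": 3_358_255}
families = {"MODPED": 63_188, "SELBLK": 24_260}

plant_ratio = plants["SELBLK"] / plants["MODPED"]
family_ratio = families["SELBLK"] / families["MODPED"]

print(f"SELBLK plants relative to MODPED:   {plant_ratio:.1%}")
print(f"SELBLK families relative to MODPED: {family_ratio:.1%}")
```

The plant ratio is about 65%, which the text rounds to two thirds, and the family ratio is about 38%, which the text rounds to 40%.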
The GE System and Its Test
In field-based breeding, the breeder selects for phenotype. However, in simulation the genotype must be defined. The genotypic value of the genotype can be calculated from the definition of gene actions in the GE system (Fasoula and Fasoula, 1997; Mackay, 2001). The phenotypic value and family mean can be found from the genotypic value and its associated error (environmental deviation). QUCIM then conducts within-family selection from phenotypic values and among-family selection from family means. A sensible definition of the GE system is thus essential to any such simulation, since it determines the phenotypic value of a genotype and then the phenotypic mean of a population to which selection is applied. However, given the current state of our knowledge of gene-to-phenotype relationships for complex traits, it is difficult to define a real GE system comprehensively. It is therefore not possible to ensure that the GE systems used in this simulation experiment match the biophysical systems within which CIMMYT's wheat breeding program operates. For this reason, we created more than one GE system in which to compare the two strategies and considered performance of the strategies across an ensemble of GE systems. Nevertheless, a more comprehensive definition of the GE system is still required, especially for tactical questions in breeding.
Fortunately, some information about the real GE system can be acquired from simulation. For example, in the case of yield gene number, the average population fitness before selection is 8.95 for all sets with 20 yield genes and 18.96 for all sets with 40 yield genes. Thus the percentage genetic gain is 15.6 for 20 yield genes and 9.1 for 40 yield genes. It is not easy to calculate the genetic gain in practice. Usually, all generations in Table 3 appear in one planting season. However, the relative genetic gain per year was estimated at 0.9% (Rajaram, 1999) for CIMMYT's wheat breeding program, and the genetic gain in percentage over the top parent was 5.6 in Singh et al. (1998). So the numbers of yield genes used in this experiment seem to be smaller than the actual number in CIMMYT's wheat breeding program. The population used in the simulation experiment has the largest potential genetic variation for additive genes because all gene frequencies are 0.5. But the gene frequencies in a real breeding population can be quite different from 0.5. Some genes are close to being fixed and have high gene frequencies after a few cycles of selection; some genes have low gene frequencies because they were introduced only recently. So the genetic gain in an actual breeding program may be much smaller than that in the current experiment. Linkage may also affect genetic gain, but was not considered in this paper. In this sense, the yield gene number 40 used in this study may be a better approximation of the real yield gene number in CIMMYT's wheat breeding program.
In the future it will be possible to build more realistic GE systems if advances in genomics improve our understanding of the genotype-to-phenotype relationship and GE interactions (Cooper et al., 1999; Bernardo, 2001). Conclusions on the relative merits of breeding strategies based on simple gene-to-phenotype models may have to be reevaluated in the context of an exponentially growing knowledge base. Such knowledge will aid in determining gene number and gene effects on phenotype. In addition, conventional plant breeding provides a wealth of information about trait heritabilities and trait correlations, which, once determined, will help define errors, linkage, and pleiotropic effects in a GE system.
 |
CONCLUSIONS
|
|---|
The object of hybridization in breeding self-pollinated species is to combine, in a single genotype, desirable genes that are found in two or more different genotypes (Allard, 1960; Jensen, 1988). Pedigree and bulk breeding are the two most widely used methods. The pedigree method allows the breeder to keep track of the ancestry of individuals. However, the number of families to be handled increases rapidly from the F3 generation onwards, and results in greater land, labor, and bookkeeping requirements. Bulk breeding makes no attempt to keep track of the ancestry of individuals and the number of families is much smaller compared with the pedigree method. However, the bulk method also maintains undesirable genotypes in the advanced generations as a result of low within-family selection intensity (Baenziger and Peterson, 1992). Many modifications of the pedigree and bulk systems have been proposed and studied (Fehr, 1987; Jensen, 1988; Baenziger and Peterson, 1992). However, it is difficult to say which breeding strategy is better in the context of a large breeding program.
The simulation experiment using QU-GENE and QUCIM reported here showed that SELBLK is significantly superior to MODPED in genetic gain for the genetic models used in the simulation experiment, even though the adjusted gain from SELBLK was just 3.3% higher than that from MODPED across all models. Such a small difference is difficult to detect through field experimentation. For example, Gill et al. (1995) and Singh et al. (1998) found no significant differences between MODPED and SELBLK. Therefore, based on the results of this simulation study and available experimental evidence, the adoption of SELBLK is unlikely to reduce the genetic advance in yield. In addition, the greater number of crosses retained in SELBLK compared with MODPED leads to greater genetic diversity in resultant populations, which can be an advantage. Finally, SELBLK uses less land than MODPED, and the number of families in SELBLK is much smaller, thereby improving cost-effectiveness.
QU-GENE provides a flexible way to define a GE system with linkage, epistasis, multiple alleles, pleiotropy, molecular markers, and genotype by environment interaction (Podlich and Cooper, 1998). QUCIM, a QU-GENE application module, provides a flexible way to define a complicated breeding strategy (Tables 3 and 4). Two breeding strategies used in CIMMYT's wheat breeding program are defined here as an example. QUCIM can be used to simulate other breeding programs, including non-wheat and even cross-pollinated crops, by modifying the two input files for GE system and breeding strategy. Modifying the GE system definition file will define another GE system for another breeding program and another crop, while modifying the strategy definition file will define other breeding strategies. New simulation experiments are being designed to estimate the efficiency of current breeding strategies, particularly in situations where field experimentation is difficult or expensive.
 |
ACKNOWLEDGMENTS
|
|---|
This project was supported in part by the Grains Research and Development Corporation (GRDC) of Australia.
 |
NOTES
|
|---|
Received for publication May 30, 2002.
 |
REFERENCES
|
|---|
- Allard, R.W. 1960. Principles of plant breeding. John Wiley & Sons, New York.
- Baenziger, P.S., and C.J. Peterson. 1992. Genetic variation: Its origin and use for breeding self-pollinated species. p. 69–92. In H.T. Stalker and J.P. Murphy (ed.) Plant breeding in the 1990s, proceedings of the symposium on plant breeding in the 1990s. CAB International, Wallingford, UK.
- Bernardo, R. 2001. What if we knew all the genes for a quantitative trait in hybrid crops? Crop Sci. 41:1–4.
- Cooper, M., and D.W. Podlich. 2003. The E(NK) model: Extending the NK model to incorporate gene-by-environment interactions and epistasis for diploid genomes. Complexity 7:31–47.
- Cooper, M., D.W. Podlich, N.W. Jensen, S.C. Chapman, and G.L. Hammer. 1999. Modelling plant breeding programs. Trends Agron. 2:33–64.
- Cooper, M., S.C. Chapman, D.W. Podlich, and G.L. Hammer. 2002. The GP problem: Quantifying gene-to-phenotype relationships. In Silico Biol. 2:151–164.
- Falconer, D.S., and T.F.C. Mackay. 1996. Introduction to quantitative genetics. 4th ed. Longman, Essex, UK.
- Fasoula, D.A., and V.A. Fasoula. 1997. Gene action and plant breeding. Plant Breed. Rev. 15:315–375.
- Fehr, W.R. 1987. Principles of cultivar improvement. Vol. 1. Theory and technique. Macmillan Publishing Company, New York.
- Gill, J.S., M.M. Verma, R.K. Gumber, and J.S. Brar. 1995. Comparative efficiency of four selection methods for deriving high-yielding lines in mungbean [Vigna radiata (L.) Wilczek]. Theor. Appl. Genet. 90:554–560.
- Jensen, N.F. 1988. Plant breeding methodology. John Wiley & Sons, New York.
- Kauffman, S.A. 1993. The origins of order: self-organization and selection in evolution. Oxford University Press, New York.
- Mackay, T.F.C. 2001. The genetic architecture of quantitative traits. Annu. Rev. Gen. 35:303–339.
- Podlich, D.W., and M. Cooper. 1998. QU-GENE: a platform for quantitative analysis of genetic models. Bioinformatics 14:632–653.
- Podlich, D.W., M. Cooper, and K.E. Basford. 1999. Computer simulation of a selection strategy to accommodate genotype-environment interaction in a wheat recurrent selection programme. Plant Breed. 118:17–28.
- Rajaram, S. 1999. Historical aspects and future challenges of an international wheat program. p. 1–17. In M. van Ginkel et al. (ed.) Septoria and Stagonospora diseases of cereals: A compilation of global research. CIMMYT, Mexico, D.F.
- Rajaram, S., M. van Ginkel, and R.A. Fischer. 1994. CIMMYT's wheat breeding mega-environments (ME). p. 1101–1106. In Proceedings of the 8th international wheat genetics symposium. China Agricultural Scientech. Beijing, China.
- Simmonds, N.W. 1996. Family selection in plant breeding. Euphytica 90:201–208.
- Singh, R.P., S. Rajaram, A. Miranda, J. Huerta-Espino, and E. Autrique. 1998. Comparison of two crossing and four selection schemes for yield, yield traits, and slow rusting resistance to leaf rust in wheat. Euphytica 100:35–43.
- van Ginkel, M., R. Trethowan, K. Ammar, J. Wang, and M. Lillemo. 2002. Guide to bread wheat breeding at CIMMYT (rev). Wheat Special Report No. 5. CIMMYT, Mexico, D.F.
- van Oeveren, A.J., and P. Stam. 1992. Comparative simulation studies on the effects of selection for quantitative traits in autogamous crops: early selection versus single seed descent. Heredity 69:342–351.
- Wang, J., D.W. Podlich, M. Cooper, and I.H. DeLacy. 2001. Power of the joint segregation analysis method for testing mixed major-gene and polygene inheritance models of quantitative traits. Theor. Appl. Genet. 103:804–816.