a Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
melchinger{at}uni-hohenheim.de
| ABSTRACT |
|---|
Abbreviations: BCt, tth backcross generation; cM, centimorgan; MDP, marker data points; QTL, quantitative trait locus; RPG, recurrent parent genome
| INTRODUCTION |
|---|
Computer simulations have proved to be a powerful tool for investigating the design and efficiency of marker-assisted selection programs (for review see Visscher et al., 1996). These authors studied marker-assisted QTL introgression in an animal breeding context, using an infinitesimal model to explain differences among breeds. Hospital and Charcosset (1997) determined the optimal position and number of marker loci for manipulating QTL in foreground selection. Further, they investigated the combination of foreground and background selection in QTL introgression. Openshaw et al. (1994) determined the population size and marker density required in background selection. They recommended the use of four markers per chromosome (of 200-cM length) and a selection strategy for proximal recombinants of the target allele.
Although efficient PCR-based DNA markers such as simple sequence repeats and amplified fragment length polymorphisms are available (Ribaut et al., 1997), their use in background selection is restricted by the large number of required MDP. In this study, we investigate strategies for reducing the total number of MDP needed in background selection. Our research objectives were to (i) determine the number of MDP required in background selection, (ii) investigate the effects of varying population sizes from early to late backcross generations on the level of RPG and the MDP required, and (iii) compare a two-stage selection procedure, consisting of one foreground and one background selection step, with alternative selection procedures consisting of one foreground selection step and two or three background selection steps.
| Methods |
|---|
Algorithm
The software PLABSIM (Frisch et al., 1999b), a computer program written in C++, was used to simulate the recombination process during meiosis. Crossover events were generated by a random-walk algorithm (Crosby, 1973, p. 237). Recombination frequencies required for the random walk were calculated from the map distance by Haldane's (1919) mapping function. This assumes that neither chiasma interference nor chromatid interference (Stam, 1979) occurs. To check our simulation software, the original linkage map of Schön et al. (1994), which was based on experimental F2 data, was compared with a linkage map constructed from simulated data of F2 individuals by MAPMAKER software (Lander et al., 1987). Both maps were in excellent agreement, confirming that the models underlying the two software packages were similar.
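The random walk described above can be illustrated with a minimal Python sketch. It is not the PLABSIM code, which is written in C++ and not shown here; the function names and the 0/1 strand coding are assumptions made for illustration. Recombination between adjacent loci is drawn with the frequency given by Haldane's mapping function, r = (1 - exp(-2d))/2 for a map distance d in Morgan, which implies no interference.

```python
import math
import random


def haldane_recombination_frequency(distance_morgan):
    """Haldane (1919): r = (1 - exp(-2d)) / 2, with d in Morgan."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgan))


def simulate_gamete(positions_morgan, rng):
    """Random walk along one chromosome of a heterozygous parent.

    positions_morgan: sorted, non-empty list of locus positions (Morgan).
    Returns one 0/1 label per locus naming the parental strand that is
    transmitted (hypothetical coding: 0 = first strand, 1 = second strand).
    """
    strand = rng.randint(0, 1)          # starting strand chosen at random
    gamete = [strand]
    for left, right in zip(positions_morgan, positions_morgan[1:]):
        # Switch strands with the recombination frequency of this interval.
        if rng.random() < haldane_recombination_frequency(right - left):
            strand = 1 - strand
        gamete.append(strand)
    return gamete


if __name__ == "__main__":
    rng = random.Random(1)
    print(simulate_gamete([0.0, 0.2, 0.4, 0.9, 1.6], rng))
```

Because each interval is treated independently, the sketch reproduces the no-interference assumption stated in the text; a model with interference would need a different crossover process.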
Simulation Runs
Each simulation of a backcross program started with the cross of two parents, which were assumed to be homozygous and polymorphic at all loci (target locus, marker loci, background loci). The recurrent parent was assumed to carry the desirable alleles at all loci of the genome except for the target locus. The donor parent was assumed to carry the desirable allele at the target locus in homozygous state. One heterozygous F1 individual was backcrossed with the recurrent parent and n1 BC1 individuals were produced. The best BC1 individual was selected according to the selection strategies described below and, for production of generation BC2, backcrossed with the recurrent parent. This procedure was repeated for t backcross generations. For the selected individual in each generation BCt, the percentage of the RPG was determined by dividing the number of loci (marker and background loci) homozygous for the recurrent parent allele by the total number of loci monitored. Furthermore, each analysis of a marker locus in a backcross individual was counted as an MDP. In BC1, the entire set of markers was analyzed (at least in the individual selected as parent for producing generation BC2). In the following generations, only those markers not fixed for the recurrent parent allele in the nonrecurrent parent (i.e., the individual selected in the previous generation) were analyzed. The number of MDP required in each generation was counted and summed over the whole backcross program. The simulation of each backcross program was repeated 10000 times to reduce sampling effects and obtain results with sufficient numerical accuracy.
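The two bookkeeping rules in this paragraph, the RPG percentage of the selected individual and the counting of MDP, can be sketched as follows. The function names and the Boolean coding (True = homozygous for the recurrent parent allele) are assumptions for illustration and are not taken from PLABSIM.

```python
def rpg_percentage(is_rp_homozygous):
    """Percentage of monitored loci homozygous for the recurrent parent allele."""
    if not is_rp_homozygous:
        raise ValueError("at least one locus must be monitored")
    return 100.0 * sum(1 for state in is_rp_homozygous if state) / len(is_rp_homozygous)


def markers_to_assay(marker_fixed_in_parent):
    """Indices of markers still segregating, i.e. not yet fixed for the
    recurrent parent allele in the individual selected in the previous generation."""
    return [k for k, fixed in enumerate(marker_fixed_in_parent) if not fixed]


def count_mdp(n_individuals_assayed, marker_fixed_in_parent):
    """Each marker assay on one individual counts as one marker data point."""
    return n_individuals_assayed * len(markers_to_assay(marker_fixed_in_parent))


if __name__ == "__main__":
    print(rpg_percentage([True, True, False, True]))    # 3 of 4 loci
    print(markers_to_assay([True, False, False, True]))
    print(count_mdp(10, [True, False, False, True]))
```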
Threshold for the RPG
The values gained from these 10000 repetitions can be regarded as realizations of random variables that describe the proportion of RPG and the total number of MDP required after t generations in a backcross program with the parameter settings considered. The 10% percentile of the empirical distribution of the RPG in the selected individual (Q10) is used as an estimator for the amount of RPG reached after selection in generation BCt with probability 0.90. Compared with arithmetic means, percentiles have two advantages.
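As an illustration of the Q10 estimator, the following sketch takes the RPG values of the selected individual from repeated simulation runs and returns an empirical 10th percentile. The source does not state which percentile definition was used; the rule below (the smallest observed value that at least 10% of the observations do not exceed) is an assumption, and the function name is hypothetical.

```python
def empirical_percentile(values, percent=10):
    """Smallest observation v such that at least `percent` % of all
    observations are <= v (one of several common percentile definitions)."""
    if not values:
        raise ValueError("no observations")
    if not 0 < percent <= 100:
        raise ValueError("percent must lie in (0, 100]")
    ordered = sorted(values)
    # Integer ceiling of n * percent / 100 avoids floating-point rounding.
    rank = (len(ordered) * percent + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    # Toy data standing in for RPG percentages from repeated runs.
    rpg_values = [92.5, 95.0, 88.1, 97.3, 90.4, 93.8, 96.1, 89.9, 94.2, 91.7]
    print(empirical_percentile(rpg_values, 10))
```

With 10000 repetitions this rule returns the 1000th smallest RPG value, so about 90% of the simulated programs reach at least that level.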
Simulations to Determine Threshold Values
A full backcross program usually consists of six generations (Allard, 1960, p. 155). Hence, the Q10 value reached in generation BC6 by applying random selection among all individuals carrying the target allele was used as a termination threshold for a marker-assisted backcross program. This threshold was determined by simulations with selection only for presence of the target allele but no selection for any marker loci.
Selection Strategies
For describing our selection strategies in general terms, we consider a chromosome carrying the target locus (carrier chromosome) of length l0 and c further chromosomes (non-carrier chromosomes) with length lc. Positions on the chromosomes are represented by a scale in Morgan units ranging from 0 to lc. The target locus is located at position x on the carrier chromosome and two flanking markers at positions y1 and y2; i additional markers on the carrier chromosome are located at positions zi. The non-carrier chromosomes carry altogether m markers at positions uck. Let X, Y1, Y2, Zi, and Uck be indicator variables, which take the value 1 if the corresponding locus is homozygous for the recurrent parent allele and 0 otherwise. From these random variables we obtain the count variables Y = Y1 + Y2 and U = Σ Uck, where the sum runs over all markers k on all non-carrier chromosomes c. Furthermore, we define the indicator variable Z, which is 1 if all i additional markers on the carrier chromosome are homozygous for the recurrent parent allele and 0 otherwise.
By means of the random variables X, Y, Z, and U as selection indices, three sequential selection strategies were applied. The first step always involved selection of individuals carrying the target allele. Subsequently, one, two, or three steps of background selection followed (Table 1). In each selection step, only those individuals selected in the previous step are subjected to marker assays. In the individual selected for producing the next backcross generation, all markers not fixed in the previous generation(s) are assayed to determine which of them are homozygous and, hence, need not be assayed in the following generation(s).
Population Size
Backcrossing with a constant number of individuals in each generation was compared with backcrossing in which the population size nt varied from BC1 to BC3. The total number of individuals, Σnt = 300, was allocated to backcross generations BC1:BC2:BC3 with ratios of 3:2:1, 1:1:1, 1:2:4, 1:2:3, 1:3:5, and 1:3:9.
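To make the allocation concrete, the sketch below splits a fixed total of 300 individuals across BC1 to BC3 for a given ratio. Some ratios (for example 1:2:4) do not divide 300 evenly, and the source does not say how such cases were rounded; the largest-remainder rule used here is an assumption, as is the function name.

```python
def allocate_population(total, ratio):
    """Split `total` individuals across generations in proportion to `ratio`.

    Leftover individuals after integer division go to the generations with
    the largest remainders (an assumed rounding rule, not from the source).
    """
    weight_sum = sum(ratio)
    sizes = [total * w // weight_sum for w in ratio]
    remainders = [total * w % weight_sum for w in ratio]
    leftover = total - sum(sizes)
    by_remainder = sorted(range(len(ratio)), key=lambda k: remainders[k], reverse=True)
    for k in by_remainder[:leftover]:
        sizes[k] += 1
    return sizes


if __name__ == "__main__":
    for ratio in [(3, 2, 1), (1, 1, 1), (1, 2, 4), (1, 2, 3), (1, 3, 5), (1, 3, 9)]:
        print(ratio, allocate_population(300, ratio))
```

For the ratios that divide evenly the result is unambiguous, for example 150, 100, and 50 individuals for 3:2:1.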
| Results |
|---|
Three-stage selection with constant nt yielded lower Q10 values than two-stage selection only in BC1 and BC2, but in subsequent backcross generations the difference was only marginal, especially for greater nt values (Table 3). Increasing nt from 20 to 60 resulted in a substantial increase of Q10 values only up to BC3 but not in later backcross generations. Likewise, increasing nt beyond 60 resulted only in marginal gains in Q10. In comparison with two-stage selection, fewer than half as many MDP were required in a three-generation backcross program for all values of nt. This reduction was attributable to considerable savings in BC1 (Table 3).
For four-stage selection with constant nt, the Q10 values followed the same trends as for three-stage selection. Corresponding Q10 values never exceeded those for the latter procedure, but differences were negligible after generation BC2, irrespective of the choice of nt (Table 3). However, the total MDP number was reduced, compared with three-stage selection (about 15% for nt = 20 and 28% for nt = 200), and even more when compared with two-stage selection.
Variation in nt values for BC1 to BC3 with the restriction Σnt = 300 hardly influenced the Q10 values reached in BC3 under two-stage selection (Table 4). In contrast, the number of MDP required was strongly reduced with larger values for nt in advanced backcross generations. In comparison to the ratio 1:1:1, increasing ratios of nt reduced the required number of MDP by up to 50%, while decreasing ratios of nt increased the required number of MDP up to 150%. Variation of nt in three- and four-stage selection had only marginal influence on both the RPG and the required number of MDP for ratios of 3:2:1 to 1:2:4. A reduction in RPG was observed for the ratio 1:3:9 (Table 4).
| Discussion |
|---|
The response to selection can be expressed as R = iσr. Here, i denotes the selection intensity, σ the standard deviation of the RPG, and r the correlation between the proportion of recurrent parent alleles at marker loci and the proportion of recurrent parent alleles across the whole genome. Values of σ and r for the three selection strategies are given in Table 5.
Marker-assisted selection is different from selection for a quantitative character, where a high selection intensity in early generations can take advantage of the large segregation variance among individuals. There is no such optimum generation for applying high selection intensities in marker-assisted background selection. If large BC1 population sizes are chosen, the response to selection is high due to large values of σ and r (Table 5). However, in each of the following backcross generations this initial gain in RPG is halved. In contrast, the response to background selection achieved by large population sizes in the last backcross generation is fully recovered in the breeding product and not diluted by further backcrossing, even if, owing to smaller σ and r values (Table 5), the absolute values of the response to selection are smaller in advanced backcross generations. The two effects compensate each other, which explains why in BC3 the content of RPG in the selected individual is hardly influenced by the ratio of population sizes used in BC1 to BC3, given a constant total number of individuals.
Compared with two-stage selection, in three-stage or four-stage selection greater emphasis is given to the carrier chromosome in generation BC1. This is illustrated by the low value of r = 0.38 for the carrier chromosome in BC3 under four-stage selection (Table 5). Because of a high selection pressure in early backcross generations, almost all markers on the carrier chromosome are homozygous for the recurrent parent allele. Hence, they describe only poorly the differences in RPG that still exist between the individuals.
Preferential selection of individuals with high RPG content on the carrier chromosome in BC1 and BC2 results in a lower overall RPG content, because the non-carrier chromosomes, on which only a reduced selection pressure is applied, form the major part of the genome. In three- or four-stage selection, selection on the non-carrier chromosomes is less intensive in BC1. Therefore, the corresponding value for r in BC3 is distinctly higher. This results in efficient BC3 selection, which compensates for the lower RPG values derived from BC1 and BC2.
Number of Marker Data Points Required
The major portion of MDP required in a two-stage selection program with constant nt is required in generation BC1 (Table 4). Its expectation is mn1/2, where m is the total number of marker loci. A reduction in n1 results in a proportional reduction of the MDP required in generation BC1 (Table 4). In advanced backcross generations, many marker loci are already fixed for the recurrent parent allele. This results in a substantial MDP decrease if larger population sizes are used in advanced backcross generations instead of BC1 or BC2.
In the second selection step of three-stage selection, only the flanking markers are analyzed in all carriers of the target allele. Hence, instead of mn1/2 MDP only n1 MDP are required by expectation. Subsequently, analysis of the remaining marker loci in the third selection step requires (m - 2)a MDP for the a preselected individuals. This smaller number of MDP in generation BC1 results in the observed overall MDP reduction (up to 50%) (Table 4). In four-stage selection, a further MDP reduction is achieved by investigating only the i non-flanking markers on the carrier chromosome in the third selection step. This requires ia MDP instead of (m - 2)a. The whole marker set is only analyzed on the b individuals preselected in the third step, which requires (m - 2 - i)b MDP.
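The expected MDP counts for generation BC1 given in the two preceding paragraphs can be collected in a short sketch. The formulas are those stated in the text (m·n1/2 for two-stage selection, n1 + (m - 2)a for three-stage selection, and n1 + ia + (m - 2 - i)b for four-stage selection); the function names and the numerical inputs in the usage example are hypothetical and serve only to illustrate the arithmetic.

```python
def mdp_two_stage(m, n1):
    """All m markers assayed on the expected n1/2 carriers of the target allele."""
    return m * n1 / 2


def mdp_three_stage(m, n1, a):
    """Two flanking markers on n1/2 carriers, then m - 2 markers on a individuals."""
    return 2 * n1 / 2 + (m - 2) * a


def mdp_four_stage(m, n1, a, b, i):
    """Flanking markers on carriers, i carrier-chromosome markers on a
    individuals, and the remaining m - 2 - i markers on b individuals."""
    return 2 * n1 / 2 + i * a + (m - 2 - i) * b


if __name__ == "__main__":
    # Hypothetical inputs: 80 markers, 100 BC1 individuals,
    # 10 and 3 preselected individuals, 6 further carrier-chromosome markers.
    m, n1, a, b, i = 80, 100, 10, 3, 6
    print(mdp_two_stage(m, n1))
    print(mdp_three_stage(m, n1, a))
    print(mdp_four_stage(m, n1, a, b, i))
```

The savings depend strongly on how many individuals a and b pass the earlier steps, which is why the reductions reported in the tables vary with population size.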
Transferability to Other Situations in Breeding
Like simulations in general, the results presented in this study depend on the underlying model. In the present context, simulation results are influenced by (i) the theoretical assumptions underlying the simulation of the meiotic recombination and (ii) the choice of genetic and dimensioning parameters.
We chose the map of Schön et al. (1994) because it represents a typical linkage map used in breeding programs. To investigate the robustness of our results with regard to the target allele position, we analyzed two additional scenarios.
Simulations with varying linkage maps demonstrated that an average marker density higher than one marker per 20 cM results in only a marginal increase of Q10 values but requires a substantially larger number of MDP (Frisch et al., 1998). In generations BC1 and BC2, a chromosome consists of only a few segments of different origin (for a chromosome of length l, the expected number of segments in BC1 is l + 1). Hence, the bottleneck limiting marker-assisted selection in early backcross generations is the number of chromosome segments itself, not the number of markers used for monitoring the composition of the chromosomes.
With a linkage map with equally spaced markers (Frisch et al., 1998), smaller population sizes and fewer MDP were required than with the linkage map underlying this study, which has regions of 60 or 80 cM length not covered by markers. For example, with a linkage map uniformly covered by markers, a saving of four backcross generations can be achieved with population sizes that resulted in a saving of three backcross generations with the linkage map used in this study (Frisch et al., 1998). This shows that an equally covered linkage map is mandatory for obtaining maximum RPG values in BC2 and BC3.
The differences in Q10 and MDP values between the selection strategies are caused by a different treatment of carrier and non-carrier chromosomes. Hence, the ratio between carrier and non-carrier chromosomes determines the different outcome of the selection strategies. The amount of reduction in the required number of MDP reported here is specific for 10 chromosomes and a map length of 16 Morgan. In crops with genomes consisting of fewer than 10 chromosomes, the differences are expected to be smaller, because the ratio between carrier and non-carrier chromosomes increases. For more than 10 chromosomes, the proportion of genome on the non-carrier chromosomes increases and, consequently, the differences between the selection strategies are expected to be greater.
The presented results should cover a wide range of gene introgression programs in crops with 2x = 20 and also 2x = 18 chromosomes, such as maize or sugar beet (Beta vulgaris L.). For different linkage maps, our simulation software PLABSIM (Frisch et al., 1999b) can be used for conducting simulations to compare the effect of selection strategies or breeding designs in marker-assisted backcrossing.
Design of Marker-Assisted Backcross Programs
Tanksley et al. (1989) stated that a sufficiently high proportion of the RPG is recovered after three generations of marker-assisted backcrossing. Hospital et al. (1992) expected a saving of two backcross generations because of marker-assisted background selection. This is in accordance with our simulations, resulting in a saving of two to four backcross generations in the transfer of a single target allele (Table 3).
The backcross procedure can be terminated after four instead of six backcross generations even with small population sizes and a limited number of MDP (Table 2). This demonstrates that marker technology can be advantageous even when the resources in a breeding program are limited. A shortening from six to three backcross generations can be regarded as a realistic goal for practical breeders, because moderate population sizes and numbers of MDP are required, and the breeding program is twice as fast as it is without markers. As demonstrated by our results, marker-assisted selection has the potential to reach in generation BC3 the same level of RPG as reached in BC7 without use of markers. However, large numbers of MDP are required to unlock this potential. With the marker systems presently available, this application is still unrealistic or at least not economic.
In generations BC1 and BC2, two-stage selection is superior to three- and four-stage selection because it reaches a larger RPG proportion with a given population size (Table 3). Thus, two-stage selection seems appropriate in two-generation backcross programs with limited population size. Furthermore, it can be applied without information about the marker linkage map and, hence, is the only option for application in generation BC1, if no marker linkage map is available.
An increasing population size nt is preferable over a constant population size in a two-stage selection program, because the number of marker analyses is reduced without reducing the Q10 values. Limits for varying nt are practical restrictions for handling large values of n3 and the risk of losing the target allele in BC1 with low values of n1. With probability (1/2)^n1, none of the n1 backcross individuals carries the target allele. Hence, a minimum of 15 to 20 individuals per generation should be produced to obtain with near certainty at least one carrier of the target allele.
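The risk of losing the target allele can be checked with a few lines of Python. Each backcross individual inherits the target allele from its heterozygous parent with probability 1/2, so the probability that none of n1 individuals carries it is (1/2)^n1. The function names and the success probabilities in the usage example are chosen for illustration only.

```python
def prob_no_carrier(n1):
    """Probability that none of n1 backcross individuals carries the target allele."""
    return 0.5 ** n1


def min_population_size(success_probability):
    """Smallest n1 for which at least one carrier occurs with the given probability."""
    if not 0.0 < success_probability < 1.0:
        raise ValueError("success_probability must lie strictly between 0 and 1")
    n1 = 1
    while 1.0 - prob_no_carrier(n1) < success_probability:
        n1 += 1
    return n1


if __name__ == "__main__":
    for n1 in (5, 10, 15, 20):
        print(n1, prob_no_carrier(n1))
    print(min_population_size(0.99), min_population_size(0.999))
```

With 15 individuals the probability of finding no carrier is about 3 in 100000, and with 20 individuals it is below 1 in a million, which is consistent with the recommended minimum of 15 to 20 individuals per generation.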
Reduction of the linkage drag is one of the main goals in marker-assisted backcrossing (Tanksley et al., 1989). Theoretical results (Stam and Zeven, 1981) show that the donor segment attached to the target allele remains surprisingly large in backcrossing without marker-assisted selection even in advanced backcross generations. In introgression of target alleles from unadapted germplasm, linkage drag is the main cause for the differences between the recipient line and the converted line. Tightly linked flanking markers can be used for a substantial reduction of the linkage drag. Individuals with recombination between tightly linked loci have a low frequency in backcross populations, but may not be selected by applying two-stage selection. Hence, if reduction of the linkage drag has high priority, three- or four-stage selection should be applied. This avoids the necessity of additional backcross generations at the end of the breeding program to ensure detection of a recombination event between tightly linked flanking markers and the target locus.
While three- and four-stage selection yield considerably lower RPG values in BC2 than two-stage selection, the slightly lower Q10 values reached in BC3 can be compensated by larger population sizes n3. Thus, without restrictions on n3, applying three- or four-stage selection in three-generation backcross programs results in a reduction of the required number of MDP by as much as 50 or 75% (Table 3). These strategies combine economic marker use with the possibility to efficiently reduce the linkage drag.
In a separate paper (Frisch et al., 1999a), we give equations for calculating the minimal population size for obtaining at least one carrier of the target allele homozygous for the recurrent parent allele at one or both flanking markers. The required population size depends on (i) the map distances between the flanking markers and the target allele and (ii) the chosen probability of success. These results can be used for the design of efficient three- and four-stage selection backcross programs in marker-assisted background selection.
| ACKNOWLEDGMENTS |
|---|
Received for publication November 24, 1998.