a Institute of Plant Breeding, Seed Science, and Population Genetics, Univ. of Hohenheim, 70593 Stuttgart, Germany
b Crop Science Dep., Univ. of Illinois, 1102 South Goodwin Avenue, Urbana, IL 61801
c International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641 06600 Mexico D.F., Mexico
d Institute of Crop Breeding and Cultivation, Chinese Academy of Agric. Sciences, Zhongguancun South Street 12, 100081, Beijing, China
* Corresponding author (melchinger{at}uni-hohenheim.de).
| ABSTRACT |
|---|
Abbreviations: CIMMYT, International Maize and Wheat Improvement Center; HWE, Hardy-Weinberg equilibrium; ME, megaenvironment; LD, linkage disequilibrium; MRD, modified Rogers' distance; PC, principal coordinate; PCoA, principal coordinate analysis; QTL, quantitative trait locus; SSR, simple sequence repeat.
| INTRODUCTION |
|---|
Association mapping was proposed as one approach to detect genes and alleles of interest in germplasm collections (Lynch and Walsh, 1997). The resolution of association studies in a sample depends on the extent of linkage disequilibrium (LD) across the genome. LD (or the correlation between alleles of different loci) depends generally on the genealogy of the germplasm. Besides this, drift and selection within populations can also cause LD. The genomic structure of LD must be empirically determined before embarking on association studies because it can vary among samples of germplasm. The advent of PCR-based molecular markers such as SSRs has created an opportunity for fine-scale genetic characterization of germplasm collections. Since SSR markers are highly polymorphic (Smith et al., 1997), easy to generate, and highly repeatable (Heckenberger et al., 2002), they can be used for large-scale investigations as needed in the case of genetic resources (Powell et al., 1996).
From 1964 until 1973, CIMMYT developed and improved a wide array of maize germplasm. Populations were established with materials from a single racial complex. In 1974, a major shift in the organization of the germplasm was initiated. Germplasm from different racial complexes was mixed and more than 100 populations were established to (i) reduce the large collection of germplasm from CIMMYT's gene bank to a number that can be handled efficiently in a breeding program and (ii) use the combining ability of different germplasm sources for intrapopulation improvement. In addition, 30 broad-based back-up pools were formed as an insurance against narrowing the germplasm base of the populations (CIMMYT, 1998). These pools and populations have played an important role in maize breeding and production in developing countries and have been exploited as sources of new germplasm for temperate regions (Ron Parra and Hallauer, 1997). Detailed knowledge about LD and genetic diversity of these populations would increase the efficiency of their use in breeding. However, little is known about the molecular diversity in tropical and subtropical maize populations (Warburton et al., 2002), and information about LD in this germplasm is entirely lacking.
The main objective of our study was to characterize the population genetic structure of 23 CIMMYT maize populations as a basis for an efficient use of this germplasm in breeding programs. In particular, we (i) investigated the molecular genetic diversity within and among 23 of CIMMYT's maize populations, (ii) examined genotype frequencies for deviations from Hardy-Weinberg equilibrium at individual loci, and (iii) tested for LD between pairs of loci.
| MATERIALS AND METHODS |
|---|
Statistical Analyses
The number of alleles per locus (further referred to as allelic richness) was determined for the entire set of 672 individuals analyzed and for various subsets within this collection (populations, MEs). The existence of population- or ME-specific alleles was determined. The total gene diversity (HT) based on SSR data across all populations was decomposed into (i) gene diversity between individuals within each population (HS) and (ii) gene diversity between populations within each ME (HME) according to Nei (1987, p. 164) and Chakraborty (1980). Confidence intervals for HS values were obtained by a bootstrap procedure with resampling across markers and individuals. The coefficient of gene differentiation (GST) was used as a measure of genetic differentiation between populations of the same ME and different MEs and was calculated according to Nei (1987, p. 190). GST is the proportion of the total genetic diversity that is due to differences between MEs. The fixation index FIS for each population was estimated according to Nei (1987, p. 164) as one minus the observed heterozygosity divided by the expected heterozygosity.
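The diversity statistics described in this paragraph can be illustrated with a minimal Python sketch for a single locus. It assumes the basic forms H = 1 - sum(p_i^2), GST = (HT - HS) / HT, and FIS = 1 - Ho / He, and it omits the sample-size corrections given by Nei (1987). The function names and the genotype encoding (pairs of allele labels) are hypothetical and are not taken from the Plabsim software.

```python
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies at one locus from diploid genotypes, e.g. [("a", "b"), ...]."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def fixation_index(genotypes):
    """FIS = 1 - Ho / He for one population (undefined if the locus is monomorphic)."""
    h_exp = gene_diversity(allele_freqs(genotypes))
    return 1.0 - observed_heterozygosity(genotypes) / h_exp


def gst(populations):
    """GST = (HT - HS) / HT, giving equal weight to every population."""
    freqs = [allele_freqs(pop) for pop in populations]
    h_s = sum(gene_diversity(f) for f in freqs) / len(freqs)
    alleles = set().union(*freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in freqs) / len(freqs) for a in alleles}
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t
```

In this sketch, a population consisting only of homozygotes for two equally frequent alleles has FIS = 1, and two populations fixed for different alleles have GST = 1.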
The average number of alleles, the number of unique alleles, HS, and HME depend on the number of individuals analyzed per population. In the tropical populations, 48 individuals were sampled, whereas in the subtropical and temperate populations only 21 individuals were sampled per population. Therefore, we used a resampling strategy to obtain comparable estimates: a random sample of 21 individuals from each of the tropical populations was chosen for the above analyses, sampling was repeated 1000 times, and the results were averaged.
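The resampling strategy can be sketched as follows, here for the number of alleles at a single locus. Sampling without replacement and the fixed random seed are assumptions of this sketch, and the function names are hypothetical.

```python
import random


def allele_count(genotypes):
    """Number of distinct alleles observed at one locus."""
    return len({allele for genotype in genotypes for allele in genotype})


def resampled_allele_count(genotypes, n=21, reps=1000, seed=0):
    """Mean number of alleles in random subsamples of n individuals.

    Individuals are drawn without replacement (an assumption of this sketch);
    the draw is repeated `reps` times and the counts are averaged.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        total += allele_count(rng.sample(genotypes, n))
    return total / reps
```

Applied to a tropical population of 48 individuals, the averaged count can never exceed the count obtained from the full sample, which is why the comparison with populations of 21 individuals is made on the resampled values.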
The modified Rogers' distance (MRD) between two populations or individuals was calculated according to Wright (1978, p. 91) and Goodman and Stuber (1983). Standard errors of MRD estimates were calculated by a bootstrap procedure with resampling across markers and individuals. Associations among operational taxonomic units were revealed by principal coordinate analysis (PCoA) (Gower, 1966) based on MRD values. PCoAs were performed for (i) the 23 populations and (ii) all individuals of the populations from each ME. In the latter case, individuals with more than 30% missing values were excluded. All analyses were performed with Version 2 of the Plabsim software (Frisch et al., 2000), which is implemented as an extension to the statistical software R (Ihaka and Gentleman, 1996).
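As an illustration of the distance measure, the sketch below computes MRD between two populations from their allele frequencies. It assumes the commonly stated form MRD = sqrt( sum over loci and alleles of (p - q)^2 / (2m) ) for m loci. The function name and the data layout (one dictionary of allele frequencies per locus) are hypothetical, and the PCoA step is not shown.

```python
from math import sqrt


def modified_rogers_distance(freqs_p, freqs_q):
    """MRD between two populations.

    freqs_p, freqs_q: lists with one dict per locus mapping allele -> frequency.
    Alleles missing from a dict are treated as having frequency 0.
    """
    if len(freqs_p) != len(freqs_q):
        raise ValueError("both populations need the same number of loci")
    m = len(freqs_p)
    squared_diffs = 0.0
    for p, q in zip(freqs_p, freqs_q):
        for allele in set(p) | set(q):
            squared_diffs += (p.get(allele, 0.0) - q.get(allele, 0.0)) ** 2
    return sqrt(squared_diffs / (2 * m))
```

With this scaling the distance is 0 for identical allele frequencies and 1 for two populations fixed for different alleles at every locus.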
Alleles with frequencies smaller than 0.10 were pooled for each locus for tests of Hardy-Weinberg equilibrium (HWE) and LD because disequilibrium coefficients have large variances with rare alleles. The population genetic software Arlequin (Schneider et al., 2000) was used for tests of HWE at individual loci and LD between pairs of loci. Arlequin uses the procedure described by Guo and Thompson (1992) to detect significant departures from HWE. LD between all pairs of loci was tested within each of the seven tropical populations using a likelihood-ratio test, whose empirical distribution is obtained by a permutation procedure (Slatkin and Excoffier, 1996). This test assumes HWE at each locus and, thus, only loci with no significant deviation from HWE were included in the analysis. An LD analysis of the 16 subtropical and temperate populations was not considered because of the small sample size of 21 individuals per population. In testing for both HWE and LD, the Bonferroni correction for multiple tests was applied (Snedecor and Cochran, 1980).
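Two of the preprocessing steps described here, the pooling of rare alleles and the Bonferroni correction, can be sketched in Python as follows. The sketch does not reproduce Arlequin's exact or permutation tests. The function names, the label used for the pooled allele class, and the default significance level of 0.05 are assumptions made for illustration.

```python
from collections import Counter


def pool_rare_alleles(genotypes, threshold=0.10, pooled_label="pooled"):
    """Replace every allele with frequency below `threshold` by one pooled class."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    rare = {allele for allele, n in counts.items() if n / total < threshold}
    return [
        tuple(pooled_label if allele in rare else allele for allele in genotype)
        for genotype in genotypes
    ]


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests whose P value is below alpha divided by the number of tests."""
    cutoff = alpha / len(p_values)
    return [p < cutoff for p in p_values]
```

For three tests at an overall level of 0.05, for example, each individual P value is compared with 0.05 / 3, so only tests with P below about 0.0167 are declared significant.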
| RESULTS |
|---|
Relationships between Populations
Values of MRD between pairs of populations averaged 0.28 and ranged from 0.20 (P22 × Pl24) to 0.41 (P32 × P48) with significant differences (P < 0.01) between MRD estimates (Table 3). The average MRD between all pairs of populations within MEs ranged from 0.22 (temperate ME) to 0.26 (subtropical intermediate-maturity ME) and averaged 0.25. The average MRD between all pairs of populations of different MEs was maximum for tropical × subtropical early-maturity populations (0.32) and minimum for subtropical early-maturity × temperate populations (0.24).
| DISCUSSION |
|---|
Hardy-Weinberg Equilibrium
The Hardy-Weinberg law describes the fundamental observation that in a large random-mating population both gene frequencies and genotype frequencies are constant across generations assuming absence of migration, mutation, and selection. The genotype frequencies are determined by the gene frequencies (Falconer and Mackay, 1996, p. 5). CIMMYT breeders have maintained their populations by planting a minimum of 20 rows and 21 plants per row. All plants consistent with the varietal description were shoot bagged and pollen from 10 rows was bulked to pollinate plants in the other 10 rows and vice versa. A minimum of 300 to 350 typical ears from the pollinated plants was chosen to represent each population. Considering the procedure to maintain the germplasm, it was expected that the populations would be in HWE after one generation of random mating. However, all 23 maize populations deviated significantly from HWE (Fig. 1) and showed a deficit of heterozygous loci (Table 2). This is in agreement with previous reports on other maize populations. Labate et al. (2000) investigated two random-mated maize populations and found that 27% of tests for deviation from Hardy-Weinberg equilibrium were significant, with deviations occurring due to an excess of homozygosity of 72 and 87%. Dubreuil and Charcosset (1998) also detected an excess of homozygosity in 10 populations from Europe and the U.S. using RFLP markers. In 17 open-pollinated populations assayed at 13 enzyme marker loci, 27% of Hardy-Weinberg tests were significant, with 94% showing an excess of homozygosity (Kahler et al., 1986).
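The relationship between gene frequencies and genotype frequencies under HWE can be made concrete with a short Python sketch. It computes the expected genotype proportions (p_i^2 for homozygotes, 2 p_i p_j for heterozygotes) at a locus with any number of alleles. The function name and the example frequencies are hypothetical and are not data from this study.

```python
from itertools import combinations_with_replacement


def hwe_expected(freqs):
    """Expected genotype proportions under HWE for one locus.

    freqs: dict mapping allele -> frequency (frequencies should sum to 1).
    Returns a dict mapping unordered genotypes (a, b), a <= b, to proportions.
    """
    expected = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            expected[(a, b)] = freqs[a] ** 2          # homozygote: p^2
        else:
            expected[(a, b)] = 2 * freqs[a] * freqs[b]  # heterozygote: 2pq
    return expected
```

A deficit of heterozygotes, as observed in the populations studied here, corresponds to observed heterozygote proportions below the 2 p_i p_j values returned by such a calculation.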
The inbreeding of the populations observed in our study can be related to various causes: (i) positive assortative matings between individuals (homogamy), (ii) artificial subgrouping of individuals from populations, (iii) selection favoring homozygotes, and (iv) experimental errors during the laboratory assay for SSRs. Even though precautions were taken to avoid positive assortative mating between individuals, it cannot be excluded entirely because late flowering plants are preferentially crossed to late ones, and early flowering plants with early ones. However, only SSRs closely linked to QTL for flowering time should show a higher degree of homozygosity than expected under HWE. Assortative mating can be one reason for an artificial subgrouping of individuals, but a closer examination of the PCoAs (Fig. 3) did not provide any clue that the deficit of heterozygous individuals could be related to a subgrouping of individuals from populations. Selection favoring homozygotes is unlikely in maize, where fitness increases with heterozygosity. The choice of SSRs with tri- and higher repeats in our study and the use of the Local Southern sizing method to estimate the fragment sizes, which represents a conservative allele-calling procedure, reduce the laboratory error sources, which cause overestimation of heterozygosity. However, most experimental errors would lead to an overestimation of homozygotes because (i) a heterozygous locus carrying a null allele would be scored as a homozygous locus, (ii) alleles could not be detected because of competition during the PCR reaction, and (iii) the setting of the threshold of band intensity to detect alleles can be too strict.
Thus, experimental errors are probably the major cause of heterozygote deficiency within the populations apart from genuine genetic causes. To separate the two sources, it would be prudent in future studies to include, besides the two inbred checks, their hybrid as a control to estimate the error rate for mis-scoring of heterozygous loci.
Linkage Disequilibrium
LD can result from and be maintained by epistasis (Falconer and Mackay, 1996, p. 16). It can also arise from admixture of populations with different gene frequencies, or from drift in small populations. Because population admixture occurred recently, during the establishment of the tropical populations, it could have caused LD. However, we found that less than 0.3% of the two-locus disequilibrium tests were significant, which can be explained by type I error alone. This is in accordance with a study reported by Stuber et al. (1980), who evaluated LD among eight enzyme loci in four long-term maize selection experiments. In contrast to these results, Remington et al. (2001) reported evidence of genome-wide LD among 47 SSRs for 102 maize inbred lines from temperate and tropical regions. LD was reduced but not eliminated by grouping lines into three empirically determined subpopulations. Nevertheless, artificial population admixture within the subpopulations caused by sampling lines from different germplasm sources could be one reason for the detection of LD in that survey. On the one hand, the lack of LD in our study can be explained by the low-density marker map and the decrease of LD with successive generations of intermating since the establishment of the populations. On the other hand, the sample size of 48 individuals per population, the precision in estimating haplotype frequencies with the EM algorithm (Excoffier and Slatkin, 1995), and the elimination of loci deviating from HWE (Fig. 1) result in a low power to detect LD. Further investigations are required to examine the influence of the sample size and the structure of the population on the power of detecting LD.
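The decrease of LD with successive generations of intermating can be illustrated with the standard relation D_t = D_0 (1 - r)^t for a large random-mating population, where r is the recombination frequency between the two loci. The Python sketch below is illustrative only; the function name, the starting value of D, and the number of generations are hypothetical and are not estimates from this study.

```python
def ld_after_generations(d0, r, t):
    """Expected LD coefficient after t generations of random mating.

    d0: initial disequilibrium coefficient D_0
    r:  recombination frequency between the two loci (0.5 for unlinked loci)
    t:  number of generations of random mating
    """
    return d0 * (1.0 - r) ** t


# Unlinked loci (r = 0.5) lose half of the remaining LD in every generation.
decay_unlinked = [ld_after_generations(0.25, 0.5, t) for t in range(4)]
```

Under these assumptions, LD between unlinked or loosely linked markers created by admixture decays quickly, which is consistent with the argument that a low-density marker map would detect little of it after several generations of intermating.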
Molecular Diversity of the Populations
We observed a higher total molecular allelic richness (8.02 alleles per locus) and average molecular allelic richness of the populations adapted to different MEs (5.7 alleles per locus) than reported in previous SSR studies of maize germplasm, although the Local Southern sizing method used to estimate the fragment sizes represents a conservative allele-calling procedure. Labate et al. (2003) found an average of 6.5 alleles per locus analyzing 461 plants representing a diverse array of U.S. germplasm. Matsuoka et al. (2002) found, on average, 6.9 alleles per locus for 101 maize inbred lines representing three major germplasm sources (Tropical, U.S., and Canadian/European inbreds). The total gene diversity across all populations (0.62) in our study was the same as reported by Matsuoka et al. (2002). The high molecular allelic richness (Table 1) and gene diversity values in our study confirm the broad genetic base of the populations expected from the pedigree data (Table 1).
The allelic richness and number of unique alleles were significantly higher for the tropical populations (6.07 alleles per locus, 86 alleles, respectively) than for populations adapted to the three other MEs (5.86, 5.43, and 5.34 alleles per locus, 37, 23, and 22 alleles). However, the results of the resampled tropical populations obtained from the same number of individuals per population as sampled in each of the subtropical and temperate populations clearly demonstrated the importance of the number of individuals investigated: the more individuals that are sampled, the higher the probability of detecting rare alleles. The small difference between the pools and the populations with respect to average gene diversity (HS), number of alleles, and number of unique alleles (Table 2) was surprising because (i) the pools have been assessed with a larger effective population size than were the populations and (ii) new material has been regularly introgressed into the pools. Our results indicate that a loss of rare alleles in the populations caused by drift seems to be uncommon and suggest that maintaining back-up pools is not necessary.
Genetic Structure of the Populations
PCoA based on MRD of the populations (Fig. 2) clearly supported the ME structure. PC1, which explained 23.6% of the total variance, revealed a major split between the (i) tropical, (ii) subtropical intermediate-maturity, and (iii) subtropical early-maturity and temperate ME. The position of the four temperate pools between the two groups of the subtropical early-maturity populations (P46, Pl27 vs. P48, Pl30) can be explained by the germplasm base of these populations and pools (Table 1) and the similar selection pressure applied while adapting them to winter maize areas in the subtropics and tropics.
Most of the variation was found within the populations and just a minor part (9% on average) between the populations. The higher GST values for the tropical, subtropical intermediate-maturity, and subtropical early-maturity populations (0.10, 0.09, and 0.09, respectively) than for the temperate populations (0.07) can be explained by the pedigree information (Table 1). In the temperate populations, many germplasm sources were combined to establish broad-based pools comprising different racial complexes. An analysis of GST values for individual loci revealed that the following SSRs were associated with the structuring of the germplasm: phi014, phi031, phi053, and phi112. Such a tendency may indicate that the chromosomal regions harboring these SSRs are not selectively neutral. Several studies reported QTL for the anthesis-silking interval in the vicinity of phi014 and phi031 (Ribaut et al., 1996; Veldboom et al., 1994). QTL for days to pollen were reported in chromosomal regions near phi053 and phi112 (CIMMYT, unpublished data). This seems to be an interesting starting point for further fine-scale and association mapping approaches of the underlying genes.
The clustering observed in the tropical populations is largely consistent with the pedigree information (Table 1). Pl24 was formed of Tuxpeño germplasm. P21 was established from seven Tuxpeño races and some families from Pl24. Although P43 was derived from Tuxpeño germplasm, it did not cluster closely with P21, consistent with field data that show high levels of heterosis between P21 and P43. In addition to Tuxpeño germplasm, P22 and P29 contain other materials such as ETO or Cuban flint. However, the results of the PCoA suggest that both populations (P29 and P22) contain mainly Tuxpeño germplasm. P21 and P32 were widely separated in the PCoA, consistent with numerous reports showing substantial heterosis between Tuxpeño and ETO germplasm (Wellhausen, 1978).
In the subtropical intermediate-maturity populations, individuals of P34 and P42 clustered together, consistent with the pedigree information. P42 and P34 both contain ETO germplasm. The latter also includes Cuban flints and Tuxpeño germplasm. The wide distribution of Pl31 over the first and second PCs can be explained by the broad range of germplasm used in its formation (Table 1). Individuals of Pl31 overlapped with individuals of P47 and Pl34, which is again consistent with pedigree information. P47 was formed using 276 half-sibs of Pl32, which itself was established with germplasm from the same sources as Pl31. Individuals of P45 are adjacent to individuals of P33 and partly overlap with them. P45 contains mainly Tuxpeño and U.S. dents but also Cuban flint, the latter being related to Cateto flint from P33 (Goodman and Brown, 1988).
In the subtropical early-maturity populations, two clearly separated clusters were observed: individuals of Pl27 and P46 vs. Pl30 and P48. Individuals of Pl28 were positioned midway between these two groups, which was again in accordance with pedigree information. Pl27 and P46 were both established using flint germplasm from different countries. P48 was generated from 54 half-sib families of Pl30, which was established from dent germplasm from Europe, China, Lebanon, South America and the U.S. Corn Belt. In contrast, Pl28 was developed by mixing flint and dent germplasm from Pl27 and Pl30.
In the temperate populations, the individuals of the four pools were widely spread over the first and second PC and only Pl42 was separated from the three other pools. This reflects nicely the selection history and the establishment of the germplasm (Table 1). Pl42 was formed to introduce tropical germplasm into temperate areas, whereas Pl39, Pl40, and Pl41 were designed to introgress temperate germplasm for the winter maize areas in the subtropics and tropics.
The analysis of the 23 maize populations clearly revealed that most of the genetic diversity is within the populations and just a minor part between the populations. This can be explained by the establishment of the populations and pools, which mostly disregarded racial complexes, and suggests that the procedures applied to handle the broad range of available germplasm were suboptimal with regard to (i) maintaining maximum genetic diversity within the populations and (ii) conserving genetic diversity between the populations. It is rather likely that desired alleles, which occurred with high frequency in just one racial complex, can be lost by mixing different germplasm sources.
Germplasm based on different racial complexes might be useful for the improvement of open-pollinated varieties. However, this germplasm is less suitable for hybrid breeding, where clearly distinct heterotic groups are advantageous (Melchinger, 1999). The reduced genetic diversity among the populations caused by admixture can only be recovered by long-term isolation or reciprocal recurrent selection programs. Therefore, only the few populations based on one racial complex (P21, P32, P33, P42, P43, and Pl24) seem to be suitable for hybrid breeding programs. If no populations based on one racial complex are available for a certain ME, breeders can use either populations not adapted to the ME or landraces in their search for germplasm suitable for hybrid breeding.
| ACKNOWLEDGMENTS |
|---|
Received for publication April 30, 2003.
| REFERENCES |
|---|