Published online 31 May 2007
Published in Crop Sci 47:1018-1030 (2007)
© 2007 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

CROP BREEDING & GENETICS

Evaluation of Genetic Diversity and Genome-wide Linkage Disequilibrium among U.S. Wheat (Triticum aestivum L.) Germplasm Representing Different Market Classes

Shiaoman Chao (a,*), Wenjun Zhang (b), Jorge Dubcovsky (b), and Mark Sorrells (c)

a USDA-ARS Biosciences Research Lab., 1605 Albrecht Blvd., Fargo, ND 58105
b Dep. of Plant Sciences, Univ. of California, Davis, CA 95616
c Dep. of Plant Breeding and Genetics, Cornell Univ., 240 Emerson Hall, Ithaca, NY 14853

* Corresponding author (chaos@fargo.ars.usda.gov).


    ABSTRACT
Genetic diversity and genome-wide linkage disequilibrium (LD) were investigated among 43 U.S. wheat (Triticum aestivum L.) elite cultivars and breeding lines representing seven U.S. wheat market classes using 242 wheat genomic simple sequence repeat (SSR) markers distributed throughout the wheat genome. Genetic diversity among these lines was examined using genetic distance-based and model-based clustering methods, and analysis of molecular variance. Four populations were identified from the model-based analysis, which partitioned each of the spring and winter populations into two subpopulations, corresponding largely to major geographic regions of wheat production in the United States. This suggests that the genetic diversity existing among this U.S. wheat germplasm was influenced more by regional adaptation than by market class, and that the individuals clustered in the same model-based population likely shared related ancestral lines in their breeding history. For this germplasm collection, genome-wide LD estimates were generally less than 1 cM for the genetically linked loci pairs. This may result from the population stratification and small sample size that reduced statistical power. Most of the LD regions observed were between loci less than 10 cM apart. However, the distribution of LD was not uniform based on linkage distance and was independent of marker density. Consequently, LD is likely to vary widely among wheat populations.

Abbreviations: AMOVA, analysis of molecular variance • CS, Chinese Spring • EST, expressed sequence tag • Fst, Wright's fixation index • HRS, hard red spring • HRW, hard red winter • HWS, hard white spring • HWW, hard white winter • LD, linkage disequilibrium • PCR, polymerase chain reaction • PIC, polymorphism information content • QTL, quantitative trait loci • SSR, simple sequence repeat • SRW, soft red winter • SWS, soft white spring • SWW, soft white winter • UPGMA, unweighted pair-group method with arithmetic mean.


    INTRODUCTION
WHEAT (Triticum aestivum L.) is the principal cereal grain grown in the United States for both domestic consumption and export. Improved cultivars and cultural practices have resulted in a steady increase in the average wheat yield during the past 30 yr, from 2000 kg ha⁻¹ in 1970 to over 2700 kg ha⁻¹ in 2004 (Wheat Yearbook, USDA, http://usda.mannlib.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1295). To maintain a steady rate of wheat improvement, exploring genetic diversity at the molecular level by means of molecular genetics technologies and integrating such information with conventional breeding methods will be critical.

Genetic diversity plays a vital role in developing improved cultivars. Previously, the level and patterns of genetic diversity among U.S. wheat cultivars have been investigated among soft and hard red winter wheats (Cox et al., 1986), hard red spring wheat (Chen et al., 1994), soft winter wheat from the eastern United States (Kim and Ward, 1997), a sample of spring and winter wheat cultivars from the Pacific Northwest region (Barrett et al., 1998), and hard red winter wheat cultivars from the Northern Great Plains (Fufa et al., 2005). The measures applied in these studies and others can be summarized into four categories: pedigree data, phenotypic traits, storage proteins, and DNA markers. In general, the diversity estimates based on different methods were positively correlated (Parker et al., 2002; Fufa et al., 2005). Among DNA markers used, restriction fragment length polymorphisms (Kim and Ward, 1997; Paull et al., 1998), sequence tagged site (Chen et al., 1994), amplified fragment length polymorphisms (Barrett et al., 1998; Manifesto et al., 2001), and simple sequence repeats (SSRs) (Huang et al., 2002; Fufa et al., 2005) have all been found suitable for diversity studies.

Knowledge of the level of genetic diversity and familiarity with genetic and historical relationships among elite germplasm can further the exploitation of genetic variation in wheat. One genetic analysis that can be used to develop this knowledge is linkage disequilibrium (LD), or nonrandom association of alleles at adjacent loci within a population, which is the basis for association mapping strategies. In association mapping, phenotypic diversity is surveyed among a population of individuals with varying degrees of relationship, followed by identification of marker polymorphisms that correlate with phenotypic variation (Buckler and Thornsberry, 2002). Dissecting complex agronomic traits using LD-based mapping has recently been reported in crop plants such as rice (Oryza sativa L.) (Garris et al., 2003), maize (Zea mays L.) (Buckler et al., 2006), potato (Solanum tuberosum L.) (Simko et al., 2004), barley (Hordeum vulgare L.) (Kraakman et al., 2004), and common wheat (Breseghello and Sorrells, 2006). Association mapping depends on the patterns of LD, how far the usable levels of disequilibrium extend in the genome, and how much LD varies from one chromosome region or from one population to another. Factors such as the mating system, the recombination rate, population structure, population history, genetic drift, directional selection, and gene fixation at different rates on different chromosome regions can all affect the patterns of LD (Gaut and Long, 2003). To date, the extent of LD patterns in plants has been examined in Arabidopsis (Nordborg et al., 2002), maize (Remington et al., 2001), barley (Kraakman et al., 2004), rice (Garris et al., 2003), sorghum [Sorghum bicolor (L.) Moench] (Hamblin et al., 2004), durum wheat (T. turgidum L. var. durum) (Maccaferri et al., 2005), and loblolly pine (Pinus taeda L.) (Brown et al., 2004). These results have indicated that, in general, LD decay with distance occurs at a much slower rate in self-pollinated plants, such as Arabidopsis, rice, barley, durum wheat, and sorghum, than in outcrossing species, including maize and loblolly pine. In barley, however, it was further demonstrated that the extent of LD could vary dramatically between populations with different evolutionary histories (Caldwell et al., 2006). It was found that LD in a population of a wild barley progenitor decayed rapidly, at a rate similar to that in a population of maize inbred lines. In common wheat, the LD patterns have been assessed for chromosome 2D and part of 5A (Breseghello and Sorrells, 2006), and the distribution of LD was not uniform on these chromosomes.

Currently, data are available online for over 1700 genomic and expressed sequence tag (EST)-derived SSR markers, which have been characterized and genetically and/or physically mapped across all chromosomes for at least 27 wheat accessions (GrainGenes database, http://wheat.pw.usda.gov; verified 4 Mar. 2007). Their high information content (Plaschke et al., 1995), uniform distribution throughout the wheat genomes (Somers et al., 2004), and ease of use have made SSR-based markers a valuable tool. In addition, automated protocols for polymerase chain reaction (PCR)-based techniques and data collection have been developed for SSRs (Diwan and Cregan, 1997), which allow high throughput genotyping to be readily applied in wheat. In this report, we analyze the genetic diversity among 43 U.S. wheat cultivars and breeding lines representing seven market classes using a set of 242 SSR markers each mapped to a single chromosome location and distributed over all 21 chromosomes. In addition, the distribution and extent of LD over the portion of the wheat genome covered were estimated to determine the implications of applying association mapping in wheat.


    MATERIALS AND METHODS
Plant Materials
Forty-three wheat cultivars, advanced breeding lines, and germplasm representing seven U.S. market classes were used for this study (Table 1). These lines were selected from 18 wheat breeding programs across the country and were used as parents to generate mapping populations segregating for a diverse array of agronomic traits as part of a collaborative project among U.S. public wheat breeding programs (http://maswheat.ucdavis.edu/, last update 26 Feb. 2007; verified 4 Mar. 2007). Among the 15 spring wheat genotypes representing three market classes, nine were hard red (HRS), two were hard white (HWS), and three were soft white (SWS). The germplasm IDO556 representing soft white club was grouped with SWS for analysis. Twenty-eight winter wheat genotypes representing four market classes included six hard red (HRW), five hard white (HWW), 10 soft red (SRW), six soft white (SWW), and a facultative SWS line, Q36. Also included in the analysis was cultivar Chinese Spring (CS) as a genotyping control for comparison with SSR marker data from previous studies using different assay methods. The source of CS was the Wheat Genetic Stock Collection located at USDA-ARS, University of Missouri, Columbia.


 
Table 1. Description of 43 wheat accessions included in this study.

 
SSR Marker Genotyping
Genomic DNA was extracted from individuals using seeds collected from one plant at the University of California, Davis. Nuclear DNAs were extracted from precipitated nuclei using the large-scale extraction method previously described by Dvorak et al. (1988). Altogether 1619 genomic and EST-derived SSR primer pairs were assayed in this study (Table 2). The marker identifiers and sources of the primer sequences for genomic SSR markers were gwm (Röder et al., 1998); gdm (Pestsova et al., 2000); barc (Song et al., 2002, 2005); cfa and cfd (Sourdille et al., 2001; Guyomarc'h et al., 2002); wmc (Gupta et al., 2002; http://wheat.pw.usda.gov/ggpages/SSR/WMC/; Daryl Somers, pers. comm.); and gpw (Sourdille et al., 2004). The EST-derived SSR markers, cnl and ksm, were described in Yu et al. (2004) and http://wheat.pw.usda.gov/ITMI/EST-SSR/. Polymerase chain reaction amplifications followed the M13-tailed primer PCR method (Schuelke, 2000) with the M13 oligonucleotide labeled with one of the four fluorescent dyes, 6-FAM, VIC, NED, and PET, added in the reaction mix. After multiplexing PCR products labeled with four different fluorescent dyes, the electrophoresis was performed on the Applied Biosystems (Foster City, CA) 3130xl Genetic Analyzer. GeneMapper software v3.7 (Applied Biosystems) was used for fragment analysis and allele calling. The detailed PCR amplification conditions and genotyping process were as described by Somers et al. (2004).


 
Table 2. Source and total number of simple sequence repeat (SSR) markers assayed, level of marker polymorphism, and number of SSR markers selected for this study.

 
Data Analysis
Among the 1619 SSRs assayed in this study, a subset of 242 SSR markers that were previously mapped to a unique and single chromosome location was selected for data analysis. Most (178/242) of the marker positions on each chromosome were based on the Ta-SSR-2004 consensus map (Somers et al., 2004). A CMap Matrix tool (http://www.genica.net.au/index.php?title=CMAP) was then used to position the remaining 64 markers by aligning and comparing the Ta-SSR-2004 map with either the wheat composite 2004 map or the Ta-Synthetic x Opata-SSR map. Markers with known physical bin map locations were also used to assist in positioning markers where their genetic map locations were not previously determined. Both the genetic and physical maps were accessed from the GrainGenes database. The fragment sizes for CS generated from the primers used in this study were compared with previously published sizes for CS, if available. The complete information for the 242 markers selected and their genetic and physical bin locations are available in the supplemental Table S1.

Gene diversity, defined as the probability that two randomly chosen alleles from the population are different (Weir, 1996); polymorphism information content (PIC) values, defined by Botstein et al. (1980); and total number of alleles at each SSR locus were calculated using the PowerMarker software (Liu and Muse, 2005). Gene diversity and number of alleles were determined for the entire set of samples as well as for each market class separately. Number of alleles unique to each market class was calculated using the CONVERT software v1.31 (Glaubitz, 2004). The methods implemented in the PowerMarker software were also applied to calculate the genetic distance among 43 samples based on Rogers' distance, the scaled Euclidean distance (Rogers, 1972), and to reconstruct a dendrogram using UPGMA. The Rogers' similarity coefficient factored in allele frequencies calculated from the codominant marker data, assuming no knowledge of evolutionary pressure on divergence of the lines under consideration, and is suitable to evaluate genetic similarity among germplasm derived from breeding programs (Reif et al., 2005). To test the reliability of the relationships among samples suggested by the clusters, a bootstrap analysis with 1000 replications was performed using the PowerMarker software. The consensus UPGMA tree with bootstrap values was reconstructed by the consensus program of Phylip v3.63 (J. Felsenstein, University of Washington, Seattle, WA) and displayed using the TreeView software (Page, 1996).
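The two per-locus statistics defined above can be illustrated with a minimal Python sketch. It is not the PowerMarker implementation: it omits any small-sample correction that the software may apply, and the function names and the fragment sizes in the example are hypothetical.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Return {allele: frequency} for one locus, ignoring missing calls (None)."""
    called = [a for a in alleles if a is not None]
    counts = Counter(called)
    total = len(called)
    return {a: n / total for a, n in counts.items()}

def gene_diversity(freqs):
    """Probability that two randomly chosen alleles differ: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())

def pic(freqs):
    """PIC as defined by Botstein et al. (1980):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross = sum(2 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p)) for j in range(i + 1, len(p)))
    return 1.0 - homozygosity - cross

# Hypothetical SSR fragment sizes (bp) scored on four lines at one locus.
freqs = allele_frequencies([150, 150, 152, 154])
diversity = gene_diversity(freqs)
pic_value = pic(freqs)
```

Because the cross term is never negative, PIC is always less than or equal to gene diversity for the same locus, which is the pattern seen between the two summary statistics reported in this study.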

The genetic variation among and within populations of wheat lines was tested using analysis of molecular variance (AMOVA) implemented in Arlequin v3.01 (Excoffier et al., 2005). This approach considers the number of genotype differences and estimates the variance among and within the defined genetic structure. The variance within populations was expressed as Wright's fixation index (Fst) and the statistical significance of Fst was evaluated by permuting genotypes among or within all populations 1000 times. Pairwise Fst comparisons based on genotype differences were used to evaluate genetic diversity among populations.

Population structure was investigated using a Bayesian clustering approach to infer the number of clusters (populations) with the software structure v.2 (Pritchard et al., 2000). The algorithm attempts to identify genetically distinct subpopulations based on the patterns of SSR allele frequencies. Sixty-seven loosely linked SSR markers (>40 cM) with three to four markers distributed on 21 chromosomes were selected for structure analysis to minimize detecting background LD caused by tightly linked markers (Falush et al., 2003). No prior information was used to define the clusters, and the number of subpopulations (K) was set from two to seven. Runs with K = 3 to 5 were repeated at least three times. For each run, the burn-in period and the simulation run length were both set at 100000 with admixture model and correlated allele frequency (Falush et al., 2003). This method estimated the proportion of the genomes of each individual derived from the different clusters and assigned individuals to subpopulations based on membership probability. We used the run that assigned all the lines to a single cluster at a probability >0.50.
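The assignment rule described above (a line is placed in the cluster for which its membership proportion exceeds a probability threshold) can be sketched as follows. This covers only the post-processing of membership proportions, not the Bayesian clustering itself; the function name and the example proportions are hypothetical.

```python
def assign_cluster(memberships, threshold=0.50):
    """Assign one line to a cluster from its membership proportions.

    memberships: list of proportions (one per cluster, summing to about 1).
    Returns the 0-based index of the cluster with the largest proportion if
    that proportion exceeds the threshold, otherwise None (unassigned).
    """
    best = max(range(len(memberships)), key=lambda k: memberships[k])
    return best if memberships[best] > threshold else None

# Hypothetical membership proportions for K = 4.
clear_case = assign_cluster([0.05, 0.85, 0.07, 0.03], threshold=0.80)
admixed_case = assign_cluster([0.55, 0.30, 0.10, 0.05], threshold=0.80)
admixed_relaxed = assign_cluster([0.55, 0.30, 0.10, 0.05], threshold=0.50)
```

In this sketch the admixed line is left unassigned at a threshold of 0.80 but is placed in the first cluster once the threshold is relaxed to 0.50.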

To evaluate LD, all accessions except one sister line, Reeder/BW-277S, were analyzed. Rare alleles with an allele frequency <5% along with null alleles and residual heterozygosity were treated as missing data. After this recoding, 45 of the 242 selected SSR markers had >20% missing data and were eliminated. The final data set used to perform the LD analysis consisted of 197 SSR loci and 42 lines. The LD parameter, r², and significance of each pair of SSR loci were estimated using the program TASSEL (http://www.maizegenetics.net; verified 10 Mar. 2007). The comparison-wise significance was computed using 1000 permutations and the proportion of permuted gamete distributions greater than observed gamete distribution served as an empirical P value (Weir, 1996). To obtain a critical value of r² for significant linkage, estimates of r² between unlinked loci were square root transformed and the parametric 95th percentile of that distribution was taken as a population-specific critical value of r² (Breseghello and Sorrells, 2006).
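As an illustration of the LD statistic and the permutation-based P value, the following Python sketch computes the squared allele-frequency correlation for a pair of loci scored on inbred lines. It is a simplification under stated assumptions, not the TASSEL procedure: each locus is reduced to two alleles coded 0 and 1 (SSR loci are multiallelic), and the permutation test uses this correlation itself as the test statistic. All names and data are hypothetical.

```python
import random

def r_squared(a, b):
    """Squared correlation between two biallelic loci coded 0/1.

    a, b: allele calls for the same inbred lines; None marks missing data,
    and any line missing at either locus is skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    p_a = sum(x for x, _ in pairs) / n
    p_b = sum(y for _, y in pairs) / n
    p_ab = sum(1 for x, y in pairs if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic locus carries no LD information
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom

def permutation_p(a, b, n_perm=1000, seed=0):
    """Empirical P value: share of permutations whose statistic is at least
    as large as the observed one (calls at locus b are shuffled among lines)."""
    rng = random.Random(seed)
    observed = r_squared(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if r_squared(a, shuffled) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical calls on eight lines: locus_b tracks locus_a, locus_c does not.
locus_a = [0, 0, 0, 0, 1, 1, 1, 1]
locus_b = [0, 0, 0, 0, 1, 1, 1, 1]
locus_c = [0, 1, 0, 1, 0, 1, 0, 1]
complete_ld = r_squared(locus_a, locus_b)
no_ld = r_squared(locus_a, locus_c)
p_value = permutation_p(locus_a, locus_b)
```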


    RESULTS
SSR Diversity
In this study, a total of 1619 SSR markers developed in wheat were used to genotype 43 accessions and 742 (45.8%) of them detected at least one polymorphism (Table 2). In contrast to the EST-derived and gpw SSR markers, most of the genomic SSR markers showed high levels of polymorphism (Table 2). To evaluate genetic diversity among these 43 accessions, a subset of 242 single-locus genomic SSR markers was selected. They are widely distributed on both short and long arms of 21 chromosomes with an average of 10 per chromosome. These primers were also selected because they produced a high rate of amplification with <10% missing data including nulls. Null alleles were considered as missing data because of the difficulty in differentiating them from PCR failure.

Although genomic DNA was extracted from a single plant, a low level of heterozygosity with an overall mean of 0.9% was observed. The residual heterozygosity was kept in the data set used for genetic diversity analysis because of its minimal impact on the outcome of the analysis (data not shown). The number of alleles detected by each marker varied greatly ranging from 2 to 24 with a total of 1743 detected and a mean allele number of 7.2 per marker (Table 3). Polymorphism information content values ranged from 0.12 to 0.93 with a mean of 0.62 for all markers. Overall, markers in the B genome gave slightly higher PIC values compared with markers in A or D genomes (Table 3). Number of alleles, gene diversity and PIC values calculated for each SSR marker are shown in Table S1.


 
Table 3. Simple sequence repeat (SSR) marker distribution, total number of alleles detected, and polymorphism information content (PIC) values by chromosome.

 
To explore genetic diversity among genotypes within different market classes, the estimates of gene diversity and mean number of alleles were also calculated for each market class (Table 4). Both HRS and SRW lines were very diverse with mean allele numbers of 3.74 and 3.67, respectively, per locus, and gene diversity of 0.56 and 0.57, respectively. The highest number of alleles unique to the market class was found in the SRW class (19.2%). Despite a smaller sample size for SWW, the number of alleles unique to that market class was similar (15.9%) to HRS (16.1%). Nonetheless, the higher genetic diversity levels observed in HRS and SRW reflect the larger sample size included for analysis in this study.
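Counting the alleles unique to each market class amounts to a set difference, sketched below in Python. This is not the CONVERT implementation. The text does not state the denominator behind the reported percentages, so expressing unique alleles as a share of all alleles seen in the class is an assumption; marker names, allele sizes, and the function name are hypothetical.

```python
def private_alleles(alleles_by_class):
    """Return, for each class, the (locus, allele) pairs seen in that class only.

    alleles_by_class: {class_name: set of (locus, allele) tuples}.
    """
    result = {}
    for name, alleles in alleles_by_class.items():
        others = set()
        for other_name, other_alleles in alleles_by_class.items():
            if other_name != name:
                others |= other_alleles
        result[name] = alleles - others
    return result

# Hypothetical alleles observed in two market classes.
observed = {
    "HRS": {("gwm1", 150), ("gwm1", 152), ("barc2", 200)},
    "SRW": {("gwm1", 150), ("barc2", 204)},
}
unique = private_alleles(observed)
# Assumed denominator: all alleles observed in the class.
pct_unique = {name: 100 * len(unique[name]) / len(observed[name])
              for name in observed}
```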


 
Table 4. Summary of simple sequence repeat (SSR) marker diversity estimates among wheat market classes.

 
Genetic Relationships among U.S. Wheat Germplasm
To compare genetic relationships among the 43 accessions surveyed, a majority-rule consensus tree was built using a genetic-distance matrix based measure and UPGMA methods (Fig. 1). The soft red winter lines (SRW) adapted to the East and Southeast regions of the United States formed a clearly separated group supported by a high bootstrap value (90%). Within the second major cluster, also supported by a high bootstrap value (77%), the spring germplasm was grouped separately from the winter germplasm, but these two groups were supported with low bootstrap values (<50%), suggesting cross-breeding between spring and winter germplasm was involved in the ancestral lines of these samples. The winter lines that grouped in the second major cluster were further divided into three subgroups. One included all the hard winter genotypes (HRW and HWW) from Idaho and the Midwest production area (bootstrap <50%), whereas the other two were formed exclusively by soft cultivars (SWW), one from the Northeast (bootstrap 63%) and the other from the Pacific Northwest (bootstrap 76%) region of the United States. The mixture of HRW and HWW lines within the same cluster reflects the frequent intercrossing between these two gene pools, also suggested by the pedigree data. Within the spring cluster, one of two subgroups consisted of those originating from the Northern Great Plains region, including six HRS lines and one HWS line. The other contained mostly lines from the Pacific Northwest region, including all SWS lines, one HWS line, and two HRS lines developed at CIMMYT, Jupateco 73S and Weebill 1. The pedigree data (Table 1) indicated that most of the Pacific Northwest lines analyzed in this study have CIMMYT-derived lines present in their lineages, and thus, were consistent with the relationships based on markers. The HRS line PI610750 was also within this group but was distantly related to all other cultivars as expected for a synthetic derived cultivar.
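The distance-based clustering behind this tree can be sketched in plain Python as Rogers' distance followed by UPGMA. The sketch omits the bootstrap resampling and consensus step used for Fig. 1, and it is not the PowerMarker or Phylip code; the line names and allele sizes are hypothetical. For fully homozygous lines the per-locus Rogers' distance is 0 when two lines share an allele and 1 when they differ.

```python
import math

def rogers_distance(x, y):
    """Rogers' (1972) distance averaged over loci.

    x, y: lists with one {allele: frequency} dict per locus.
    """
    total = 0.0
    for fx, fy in zip(x, y):
        alleles = set(fx) | set(fy)
        total += math.sqrt(0.5 * sum((fx.get(a, 0.0) - fy.get(a, 0.0)) ** 2
                                     for a in alleles))
    return total / len(x)

def upgma(names, matrix):
    """Build a UPGMA tree as nested tuples (left, right, node_height)."""
    clusters = [(name, 1) for name in names]  # (subtree, number of leaves)
    d = [row[:] for row in matrix]
    while len(clusters) > 1:
        n = len(clusters)
        # Find the closest pair of clusters.
        i, j = min(((a, b) for a in range(n) for b in range(a + 1, n)),
                   key=lambda ab: d[ab[0]][ab[1]])
        (ti, si), (tj, sj) = clusters[i], clusters[j]
        merged = ((ti, tj, d[i][j] / 2), si + sj)
        keep = [k for k in range(n) if k not in (i, j)]
        # Size-weighted average distance from the merged cluster to the rest.
        new_row = [(d[i][k] * si + d[j][k] * sj) / (si + sj) for k in keep]
        d = [[d[r][c] for c in keep] for r in keep]
        for row, value in zip(d, new_row):
            row.append(value)
        d.append(new_row + [0.0])
        clusters = [clusters[k] for k in keep] + [merged]
    return clusters[0][0]

# Three hypothetical homozygous lines scored at four SSR loci.
lines = {
    "A": [{150: 1.0}, {200: 1.0}, {300: 1.0}, {400: 1.0}],
    "B": [{150: 1.0}, {200: 1.0}, {300: 1.0}, {402: 1.0}],
    "C": [{152: 1.0}, {204: 1.0}, {306: 1.0}, {408: 1.0}],
}
names = list(lines)
matrix = [[rogers_distance(lines[p], lines[q]) for q in names] for p in names]
tree = upgma(names, matrix)
```

Lines A and B, which differ at one of four loci, are joined first; C joins them at a greater height.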



 
Figure 1. A consensus UPGMA dendrogram of 43 wheat accessions constructed using the genetic distance-based method based on 1000 bootstrap replications. The bootstrap values >50% are shown in the tree. The market class each sample represents is indicated in the parentheses. The four model-based populations were depicted as ■ = Winter–East, □ = Winter–West, ○ = Spring–Northern Plains, and • = Spring–Pacific Northwest groups. *Q36, a facultative SWS line, was treated as SWW in this study.

 
Genetic Diversity and Population Structure among U.S. Wheat Germplasm
We used AMOVA to assess genetic diversity within and among market classes by dividing the population based on growth habit and market class. The results indicated that 81.2% of the genetic variation (P < 0.0001) resided within market classes and 16.6% (P < 0.0001) resided among market classes. Only 2.2% of the total variation was explained by spring and winter growth habit classes and was not statistically significant (P > 0.05). Genetic variation between market classes was tested using the Fst statistic estimated from pairwise comparisons as a measure for genetic distance between market classes. Pairwise comparisons showed considerable variation in the Fst values ranging from 0.087 to 0.290 (Table 5). A low level of population differentiation was present between HRS and HWS (Fst = 0.087), and HRW and HWW (Fst = 0.096). The differences between HWS and two other market classes, SWS and SWW, were not statistically significant (P > 0.05) (even though SWS and SWW were significantly different), suggesting the presence of some level of heterogeneity within HWS. This concurs with the results from the distance-based analysis, where the two HWS lines were clustered in two separate subgroups within the spring wheat cluster. For the remaining pairwise comparisons, a more substantial population differentiation was observed. The overall Fst value estimated within market classes was 0.188 indicating moderate population structure. The AMOVA test showed that the division of wheat accessions based on market classes resulted in the partitioning of the majority of the genetic variance to within each market class, but no significant difference was found between spring and winter growth habits. The low proportion of variation explained by growth habit is likely the result of divergent subgroups within each growth habit class. This is supported by the fact that all 12 pairwise comparisons between spring and winter subgroups were significantly different.
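The reported overall Fst can be related to the AMOVA variance percentages with a short calculation. The sketch assumes that the overall Fst equals the share of total variance not residing within populations; that reading is inferred from the reported numbers rather than stated in the text, and the function name is hypothetical.

```python
def fst_from_components(pct_within, pct_among_populations, pct_among_groups):
    """Share of total variance lying among populations and among groups."""
    total = pct_within + pct_among_populations + pct_among_groups
    return (pct_among_populations + pct_among_groups) / total

# Market classes nested within growth habit: 81.2% within classes,
# 16.6% among classes, 2.2% between growth habits.
market_class_fst = fst_from_components(81.2, 16.6, 2.2)

# Model-based populations reported later in this section:
# 83.8% within, 15.5% among populations, 0.7% between growth habits.
model_based_fst = fst_from_components(83.8, 15.5, 0.7)
```

Under this assumption the first value reproduces the reported 0.188, and the second comes to about 0.162, in line with the 0.1623 reported for the model-based populations once rounding of the percentages is allowed for.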


 
Table 5. Pairwise Wright's fixation index (Fst) statistics among wheat market classes.† † Upper diagonal is Fst value, lower diagonal is P value.

 
We further explored the genetic structure among the samples using a model-based method. A model-based method is a cluster analysis that evaluates genetic similarity among genotypes without using prior information about the growth habit and the market class. This analysis suggested several theoretical subpopulation sizes with high significance; however, four subpopulations were optimal for assigning all except five lines into one of the four clusters at a posteriori probability >0.80. The five genotypes assigned to individual clusters with a posteriori probability >0.50 were McNeal, Thatcher, Wesley, Heyne, and the breeding line NY18/Clark's Cream 40–1. The proportion of membership clustered in each of the four subpopulations for each of the 43 accessions is shown in Table S2. These four model-based populations, indicated by different symbols in Fig. 1, corresponded well with their growth habit and market classification. The model-based analysis split the spring wheat from the winter wheat, which is consistent with breeding populations based on growth habit. The analysis further partitioned each of the spring and winter populations into two subpopulations and generally concurred with the genetic relationships based on the distance measure. However, the model-based clustering suggested evidence of substructure within HRS and SWW wheat that the genetic distance-based method did not detect. Splitting the SWW lines resulted in dividing all winter wheat lines into two major populations adapted to wheat production regions east and west of the Mississippi River. McNeal, a HRS cultivar developed in Montana, was more similar to the spring lines adapted to the Pacific Northwest than to the lines from the Northern Great Plains region. The Fst values for the four populations identified by the model-based analysis were 0.39 for Spring–Northern Plains group, 0.22 for Spring–Pacific Northwest group, 0.14 for Winter–East group, and 0.12 for Winter–West group.

Results from AMOVA for four model-based populations revealed that within-population differences among individuals accounted for 83.8% of genetic variation (P < 0.0001), 15.5% of the variation was attributed to differences among populations (P < 0.0001), and differences between winter and spring growth habits constituted 0.7% (P < 0.05) (data not shown). Overall Fst within populations was 0.1623. When Fst values were computed for pairwise comparisons of four model-based populations, highly significant population differentiation (P < 0.001) was detected among all pairwise combinations (data not shown).

Taken together, our analyses suggest that to better evaluate the extent and sources of genetic diversity among wheat accessions, it is necessary to assess and account for the presence of population substructure among samples. This is particularly important for samples with high levels of genetic diversity as reflected by a large proportion of genetic variance within populations from the AMOVA test.

Genome-wide LD Analysis and Distribution of LD on a Chromosomal Scale
The extent of genome-wide LD among the entire set of samples was evaluated through pairwise comparisons among 197 SSR loci (67, 59, and 71 on genomes A, B, and D, respectively) yielding 19306 estimates. Among them, 889 (4.6%) showed significant association at a comparison-wise 0.01 level. A large proportion of loci pairs (86%, 766/889) in LD were from different chromosomes. Out of 123 significant loci pairs within the same chromosomes, 70 of them were linked at <10 cM. Figure 2 shows the distribution of r² values as a function of genetic distance in centimorgans for loci pairs located on the same chromosome. The r² values declined rapidly to 0.2 within 10 cM and to 0.1 within 20 cM for linked loci in wheat.
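The counts in this paragraph can be checked with a few lines of Python. The final figure, the number of pairs expected to pass a 0.01 comparison-wise threshold by chance, is an added point of reference that assumes independent tests; it is not a value reported in the study.

```python
from math import comb

n_loci = 67 + 59 + 71                    # loci on genomes A, B, and D
n_pairs = comb(n_loci, 2)                # all pairwise comparisons
significant = 889
between_chromosomes = 766
within_chromosomes = significant - between_chromosomes

pct_significant = 100 * significant / n_pairs
pct_between = 100 * between_chromosomes / significant

# Pairs expected to be significant by chance at the 0.01 level,
# assuming independent tests (a simplification).
expected_by_chance = 0.01 * n_pairs
```

The 889 significant pairs are several times the roughly 193 expected by chance under that simplification.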



 
Figure 2. Genome-wide LD decay among SSR marker pairs as a function of genetic distance (cM). The P values were determined using 1000 permutations. The line corresponds to the population-specific threshold as the 95th percentile of the distribution of r² as evidence of genetic linkage.

 
The patterns of LD on a chromosomal scale were further evaluated for all chromosomes (Fig. 3). The r² value corresponding to the 95th percentile of the distribution of estimates between unlinked loci was 0.056 and this value was used as a population-specific threshold for r² as evidence of genetic linkage. Chromosome arms were populated with well-distributed markers except for 1BL, 1DS, 2BL, 2DL, 4AS, 4BS, 5AS, 5BS, 6AS, 6BS, 7AL, and 7DL. Regions of significant LD that involved loci 10 cM or less apart were present on most of the chromosomes, except 1B, 3D, and 7D. High levels of LD were detected among closely spaced loci physically encompassing the centromeric region of 2A, 2B, 3B, 6B, and 7B. Notable exceptions, however, were observed on 3DL, 4DL, and 6AL (two regions) where large genetic distances (>30 cM) remained in LD in this group of genotypes. Significant LD estimates among unlinked loci were detected in 15 regions on chromosomes 1A, 2A, 3B, 4D, 5A, 5B, 6A, 6D, and 7D. In contrast, several regions with multiple markers spaced less than 4 cM apart did not exhibit LD such as 3DL, 5BL, and 5DL. The pairwise r² estimates among 197 SSR loci within each chromosome ranged from 0 to 0.547, with a mean of 0.041 and a median of 0.030 (data not shown). The percentage of loci pairs in LD on individual chromosomes ranged from 0% on 1B to 30% on 2D and 7B. The B genome showed the highest proportion of significant LD even though it had the fewest markers.
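One way to obtain such a population-specific threshold is sketched below. It interprets the "parametric 95th percentile" as a normal approximation (mean plus 1.645 standard deviations) applied to the square-root-transformed unlinked estimates and squared to return to the original scale. That interpretation, the function name, and the example values are assumptions; the procedure of Breseghello and Sorrells (2006) may differ in detail.

```python
import math
import statistics

def r2_critical_value(unlinked_r2, z=1.645):
    """Threshold from LD estimates between unlinked loci.

    The values are square root transformed, the one-sided 95th percentile is
    taken under a normal approximation, and the result is squared.
    """
    roots = [math.sqrt(v) for v in unlinked_r2]
    mean = statistics.mean(roots)
    sd = statistics.stdev(roots)
    return (mean + z * sd) ** 2

# Hypothetical estimates between unlinked loci.
threshold = r2_critical_value([0.0, 0.01, 0.04])
```

Estimates between linked loci that exceed the threshold would then be attributed to genetic linkage rather than to background LD arising from population structure.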



 
Figure 3. Wheat consensus map and distribution of LD patterns by chromosome. The gray scale shown in each square corresponds to P value. The original colored display can be accessed from the supplemental Fig. S1 file. Also included on the map are the chromosome locations of genes controlling selected agronomic traits.

 
The estimated locations of several genes controlling agronomic traits were compared to the significant LD regions identified in this study (Fig. 3). Surprisingly, none of the LD regions seemed to be associated with these traits, except the 3DL region that includes genes for kernel color and leaf and stem rust (caused by Puccinia triticina Eriks. and P. graminis Pers.:Pers. f. sp. tritici Eriks. and E. Henn, respectively) resistance. Although the genes associated with seed protein, market class, and disease resistance have been subjected to selection throughout the breeding process, recombination has apparently reduced the LD.


    DISCUSSION
 
SSR Diversity
SSR marker analyses revealed a gene diversity of 0.66 and a mean allele number of 7.2 over 242 loci for this cross-section of U.S. wheat germplasm. These values are higher than previously reported estimates of 6.2 alleles among 40 elite European cultivars over 23 loci (Plaschke et al., 1995), 5.4 alleles and a gene diversity of 0.57 among 68 advanced CIMMYT lines over 47 genomic SSR markers (Dreisigacker et al., 2004), 4.8 alleles over 93 loci among 95 eastern U.S. soft winter wheat lines and cultivars (Breseghello and Sorrells, 2006), and 6.9 alleles and a gene diversity of 0.55 over 70 loci detected among 134 durum accessions representing the elite germplasm from wide geographic regions (Maccaferri et al., 2005). In contrast, Roussel et al. (2004) reported an allele number of 14.5 and a gene diversity of 0.662 among 559 French wheat accessions over 42 polymorphic loci. Using wheat accessions from the IPK gene bank, Huang et al. (2002) reported a gene diversity of 0.77 and 18.1 alleles detected over 26 loci among 998 accessions from 68 countries. The much higher levels of SSR marker diversity found in these two studies can be attributed to the use of landraces and diverse germplasm maintained in gene banks. In our study, the higher gene diversity of this set of cultivars and experimental lines, relative to other elite germplasm, likely reflects the representation of seven market classes in the United States, thus encompassing a wider gene pool, and the use of a large number of SSR markers to detect genetic diversity over the entire wheat genome.
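As an illustration of the two summary statistics compared above, the following Python sketch computes the allele number and gene diversity at a single SSR locus. It assumes the common definition of gene diversity as one minus the sum of squared allele frequencies; the exact estimator used by the authors is not stated in this section. The allele labels are hypothetical fragment sizes, not data from this study.

```python
from collections import Counter

def allele_number(alleles):
    """Number of distinct alleles observed at one locus (missing calls are None)."""
    return len({a for a in alleles if a is not None})

def gene_diversity(alleles):
    """Gene diversity at one locus, assumed here to be 1 - sum(p_i ** 2)."""
    observed = [a for a in alleles if a is not None]
    if not observed:
        raise ValueError("no allele calls at this locus")
    n = len(observed)
    counts = Counter(observed)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical allele calls (fragment sizes in bp) for eight homozygous lines.
locus = ["180", "180", "184", "184", "190", "190", "190", "196"]
n_alleles = allele_number(locus)
diversity = gene_diversity(locus)
```

Averaging these per-locus values over all loci would give figures comparable in kind to the mean allele number and gene diversity reported here.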

Genetic Relationships and Genetic Structure among U.S. Wheat Germplasm
While known pedigree information can provide a breeding history of the germplasm under consideration, errors, missing information, and unknown effects of selection and drift can affect the accuracy of interpretations that rely on pedigree data alone to assess genetic relationships among wheat cultivars (van Beuningen and Busch, 1997; Kim and Ward, 1997; Fufa et al., 2005). The genetic distance-based clustering method separated winter wheat into two clusters but kept all the spring wheat accessions together, even though the group as a whole was more diverse. Within these clusters, only SRW and SWW each formed a single cluster; the other five market classes (HRS, HWS, HRW, SWS, and HWW) did not. These results contrast with the findings of Barrett and Kidwell (1998) involving a set of wheat lines adapted mainly to the Pacific Northwest region. Our study indicates that the germplasm surveyed from different wheat production regions across the country was quite diverse. While results from the genetic distance-based and model-based approaches generally agreed with each other, some differences were noted. The four populations identified by the model-based analysis corresponded largely to major geographic regions of wheat production in the United States, indicating that the genetic diversity existing among the U.S. wheat germplasm in this study was likely the result of regional adaptation rather than market class difference, and that the individuals clustered in the same population likely shared related ancestral lines in their breeding history. In this study, Jagger was believed to be a direct descendant of Stephens based on pedigree. Structure analysis assigned them to the same population, both with a probability of >0.80. However, the relationship estimate from the distance-based method indicated that they have diverged greatly. The pedigree data also suggested that HWW lines and SWW–Pacific Northwest lines share similar ancestry, which agrees with the model-based results, but this relationship was not supported by the distance-based analysis. Taken together, our analysis indicates that the model-based results may be more consistent with the pedigree data. A similar finding was also reported among 260 maize inbred lines (Liu et al., 2003). However, the distance-based results were found to correspond better to the pedigree data among 115 U.S. rice cultivars (Lu et al., 2005). Our study involved a much smaller sample size (43), and some market classes were underrepresented. Therefore, further study including more lines from different production regions is necessary to verify this finding.

Chromosomal Distribution of Linkage Disequilibrium Patterns in Wheat
Genome-wide r2 values declined rapidly to 0.2 within 10 cM and to the baseline significance level within 20 cM for linked loci. Similar or larger estimates of LD were reported for Arabidopsis (Nordborg et al., 2002) and barley (Kraakman et al., 2004), while for sugar beet (Beta vulgaris L.) genotypes, LD was <3 cM (Kraft et al., 2000). In Lolium perenne L., LD was <3.4 cM (Skøt et al., 2005), and in maize LD declined over a distance of 2000 bp (Remington et al., 2001).
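A decline of r2 with genetic distance, as described above, is commonly summarized by averaging r2 within distance bins. The Python sketch below shows one way to do this. The bin width, the function name, and the (distance, r2) pairs are hypothetical and chosen only for illustration; this is not the authors' procedure.

```python
from collections import defaultdict

def mean_r2_by_distance(pairs, bin_width_cm=10.0):
    """Average r2 within consecutive distance bins.

    pairs: iterable of (distance_cM, r2) tuples for linked locus pairs.
    Returns a dict mapping the lower edge of each bin (in cM) to the mean r2.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, r2 in pairs:
        bin_start = int(distance // bin_width_cm) * bin_width_cm
        sums[bin_start] += r2
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Hypothetical (distance in cM, r2) values for locus pairs on the same chromosome.
example_pairs = [(2.0, 0.40), (8.0, 0.20), (12.0, 0.10), (18.0, 0.06), (35.0, 0.02)]
decay = mean_r2_by_distance(example_pairs)
```

Comparing each bin mean with a significance threshold gives a rough estimate of the distance over which LD persists in a given population.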

The genome-wide LD analysis also revealed that overall fewer than 5% of marker loci pairs showed significant LD, and that the majority of them involved two independent loci. This could be due, in part, to the small sample size, which reduced statistical power and potentially increased the effects of drift. Also, as indicated by previous studies in maize, genetic relatedness increases LD (Liu et al., 2003; Stich et al., 2005). The wide genetic diversity found among different wheat market classes and the little or no genetic similarity shared between growth habits may have contributed to the lower levels of LD detected as well. Structure analysis indicated a moderate degree of population differentiation among the samples studied, and the Fst value for one of the four model-based populations, the Spring–Northern Plains group, was estimated at 0.39, indicating the presence of population stratification in this group. The identification of substructure among wheat accessions has been used in analyses to reduce the number of spurious associations found in independent loci pairs (Maccaferri et al., 2005; Breseghello and Sorrells, 2006).

For the significant LD detected within chromosomes, consistent with previous findings in both animals and plants, our analysis showed decreasing disequilibrium with increasing distance between markers, an indication of LD maintained by genetic linkage. Most of the LD was observed between loci less than 10 cM apart, and it tended to be sporadically distributed on various chromosomes. However, closely linked markers were frequently not in LD, and LD could also be seen between distant markers. In wheat, the recombination rates tend to be elevated at the distal end of the chromosomes (Akhunov et al., 2003). From our analysis, low-recombination regions near centromeres that were in LD were found on only 5 of the 21 chromosomes. In contrast, significant LD was also found in markers physically located in the distal chromosome bins of 3DL (3DL3-0.81-1.00), 5DL (5DL5-0.76-1.00), and 6AL (6AL8-0.90-1.00). The long-range LD on 3DL and 6AL involved markers more than 40 cM apart, and higher marker density is needed to better define the LD blocks in these regions. Nonetheless, the nonuniform chromosome distribution of LD observed in this study is comparable to the findings in humans (Huttley et al., 1999; Ardlie et al., 2002), maize (Stich et al., 2005), and a previous study in wheat (Breseghello and Sorrells, 2006).
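For readers unfamiliar with the LD statistic discussed here, the Python sketch below computes r2 for a single pair of loci. It is a deliberate simplification: it assumes two alleles per locus coded 0 and 1 and fully homozygous lines, whereas SSR loci are multiallelic and require an extension of this formula. The genotype vectors are hypothetical.

```python
def r_squared(locus_a, locus_b):
    """r2 between two biallelic loci scored as 0/1 on the same homozygous lines."""
    n = len(locus_a)
    if n == 0 or n != len(locus_b):
        raise ValueError("loci must be scored on the same non-empty set of lines")
    p_a = sum(locus_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    return d * d / denom

# Hypothetical allele codes for eight lines at two loci.
marker_1 = [1, 1, 1, 1, 0, 0, 0, 0]
marker_2 = [1, 1, 1, 0, 0, 0, 0, 1]
partial_ld = r_squared(marker_1, marker_2)
complete_ld = r_squared(marker_1, marker_1)
no_ld = r_squared([1, 1, 0, 0], [1, 0, 1, 0])
```

Values near 1 indicate that alleles at the two loci almost always occur together, and values near 0 indicate independence, regardless of how close the loci are on the map.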

Our analysis indicated that there is extensive variation in the extent of LD throughout the wheat genome, and this nonuniform pattern of LD in the genome is likely the result of selection and genetic drift. We hypothesize that there might be regions of high LD surrounding genes that differentiate the U.S. wheat market classes. However, there was little evidence of LD among this set of genotypes in the regions around the kernel hardness locus on 5DS or the red kernel color genes on the long arms of chromosomes 3A and 3B, with the exception of R-D1 on 3DL. The VRN-A1 gene on 5AL has been reported to be located in a region of reduced recombination (Yan et al., 2003), but LD was not significant in the flanking regions of genes determining the spring and winter growth habit. This might be related to the fact that spring growth habit can be conferred independently by mutations in any of the three copies of the Vrn-1 gene (Vrn-A1, Vrn-B1, or Vrn-D1; Yan et al., 2004; Fu et al., 2005) relaxing the selection pressure. The genes controlling market classes may have been fixed in the ancestral lines many generations ago. Consequently, these genes and their surrounding regions have been subjected to little or no selection pressure during the breeding process and recombination has eroded LD. The LD surrounding R-D1 on 3DL may result from recent intermating of red and white wheat cultivars by breeders which is much more likely than hybridization of hard and soft or spring and winter cultivars. However, we cannot rule out the possibility that insufficient genome coverage and density of markers in some regions contributed to our inability to detect LD in those regions.

Association mapping strategies would benefit from the knowledge of LD patterns. In this study, r2 values decreased with increasing distance between markers, particularly below 10 cM, suggesting that the mapping resolution using this set of genotypes would generally be well below 10 cM for most of the genome. A comparison of the available LD studies in wheat indicates that LD varies widely among different populations, both within and among species (Maccaferri et al., 2005; Breseghello and Sorrells, 2006). Because the extent of LD is likely to be population dependent, thus reducing the predictability in estimating the marker density required for association studies, caution must be used in designing association studies in wheat. Consequently, most association mapping studies will need to focus on either previously identified quantitative trait loci (QTL) intervals and saturate those regions with markers or on candidate genes for traits of interest. The use of populations with more related breeding lines, typically those adapted to the same wheat production region, may increase the levels of LD and facilitate the detection of associations between markers and QTL using fewer markers.


    ACKNOWLEDGMENTS
 
The authors thank Jamie Rust for excellent technical assistance, and Dave Matthews for assistance in querying data from the GrainGenes database. Thanks also go to the wheat breeders participating in the wheat CAP grant who made the wheat cultivars and unreleased germplasm available for this study. This research was supported in part by the funds from the USDA, Cooperative State Research, Education and Extension Service, Coordinated Agricultural Project grant number 2006-55606-16629.


    NOTES
 
All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

Received for publication June 28, 2006.


    REFERENCES
 




This article has been cited by other articles:

E. L. Heffner, M. E. Sorrells, and J.-L. Jannink. Genomic Selection for Crop Improvement. Crop Sci., January 28, 2009; 49(1): 1-12.

L. Reddy, T. L. Friesen, S. W. Meinhardt, S. Chao, and J. D. Faris. Genomic Analysis of the Snn1 Locus on Wheat Chromosome Arm 1BS and the Identification of Candidate Genes. The Plant Genome, July 1, 2008; 1(1): 55-66.

