a Pioneer Hi-Bred International, A DuPont Company, 7300 NW 62nd Ave., Johnston, IA 50131-1004, USA
b Dep. of Plant Breeding, Cornell Univ., Ithaca, NY 14853, USA
c USDA-ARS Dale Bumpers National Rice Research Center, Stuttgart, AR 72160, USA
d USDA-ARS Crops Pathology and Genetics Research Unit, Dep. of Agronomy and Range Science, Univ. of California, Davis, CA 95616, USA
* Corresponding author (thtai{at}ucdavis.edu).
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Molecular markers, such as SSRs, have been widely used in rice germplasm evaluation for both international (Yang et al., 1994; McCouch et al., 1997; Ishii and McCouch, 2000; Ishii et al., 2001) and domestic U.S. (Mackill, 1995; Cao and Oard, 1997; Ni et al., 2002) collections. The use of SSRs to interpret population structure provides much greater resolution than other types of markers because of the high level of polymorphism at SSR loci (Cho et al., 2000; Akkaya et al., 1992). Previous generations of molecular markers were unable to detect enough genetic polymorphism among closely related rice cultivars such as those used in U.S. breeding programs to make them efficient tools for interpreting population structure (Mackill, 1995). However, SSR markers are well suited to the task. In rice, the highly polymorphic nature of SSR motifs is coupled with a low level of homoplasy observed in O. sativa cultivars (Chen et al., 2002), providing an appropriate tool for population genetic studies.
With the public availability of rice genome sequence information, there is growing interest in identifying and characterizing genes associated with both qualitative and quantitative forms of phenotypic variation. Of particular interest to rice breeders is the possibility of using existing germplasm resources for gene and allele discovery on the basis of association mapping strategies (Kruglyak, 1999; Jorde, 2000; Farnir et al., 2000). Understanding population structure is important to avoid identifying spurious associations between phenotype and genotype in association mapping (Pritchard and Rosenberg, 1999; Pritchard et al., 2000; Pritchard and Donnelly, 2001).
This is the most comprehensive study to date assessing population structure in U.S. rice, taking advantage of the ease and reproducibility of SSR allele calling with a high-throughput capillary-based system (Rhodes et al., 1998; Ponce et al., 1999; Coburn et al., 2002). The 145 accessions evaluated here represent the majority of rice cultivars that have been released in the USA during the 20th century. The specific objectives of this study were (i) to analyze population structure in U.S. rice using both genetic distance-based and model-based clustering methods, (ii) to determine whether the population structure can be attributed to modern U.S. rice breeding efforts (1930 to present) or whether it predates this history, and (iii) to explore U.S. rice breeding patterns, if any, by examining the genetic relationships of rice cultivars developed at different time periods in the 20th century.
| MATERIALS AND METHODS |
|---|
SSR Markers
A total of 169 previously developed SSR markers (Table 2) were used for genotyping (Akagi et al., 1996; Chen et al., 1997; Temnykh et al., 2000, 2001; McCouch et al., 2002). Markers were distributed along the 12 chromosomes with an average distance between markers of approximately 9 cM and an average of 14 markers per chromosome. Primer sequences for these markers can be found on the Gramene website (www.gramene.org; verified 1 September 2004).
Statistical Analysis
Genetic distance and cluster analyses were conducted using the PowerMarker program (http://www.powermarker.net; verified 1 September 2004). Nei's genetic distance (1972) was used to calculate pair-wise genetic distance among all accessions. The UPGMA method was used to conduct cluster analysis. The TreeView program distributed at http://taxonomy.zoology.gla.ac.uk/rod/rod.html (verified 1 September 2004) was used to construct clustering trees. Model-based cluster analysis was performed by the Structure program (Pritchard et al., 2000), which detects population structure in structured or admixed populations. The number of subpopulations (K) was set from 2 to 8, and each was run three times. Each run started with 10000 burn-ins followed by 50000 iterations. When K was set at 5, a run with the highest log likelihood was achieved and was used to produce model-based population structure. The polymorphism information content (PIC) for each marker was calculated (Anderson et al., 1993):
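$$\mathrm{PIC}_i = 1 - \sum_{j=1}^{n} p_{ij}^{2}$$

where $p_{ij}$ is the frequency of the $j$th allele of marker $i$ and the sum runs over all $n$ alleles of that marker.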
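The PIC calculation referred to above can be sketched in Python. This is a minimal illustration, assuming the commonly used form of the index, one minus the sum of squared allele frequencies at a marker. The function name `pic` and the fragment sizes are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def pic(allele_calls):
    """Polymorphism information content for one marker: 1 - sum(p_j ** 2)."""
    # Missing calls (None) are skipped before frequencies are computed.
    calls = [a for a in allele_calls if a is not None]
    if not calls:
        raise ValueError("no allele calls for this marker")
    counts = Counter(calls)
    total = len(calls)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical fragment sizes (bp) scored at one SSR locus in eight accessions.
example_calls = [152, 152, 156, 156, 156, 160, None, 152]
print(round(pic(example_calls), 3))
```

A marker with a single allele gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows.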
| RESULTS AND DISCUSSION |
|---|
The average observed heterogeneity of the total sample across all 169 loci was 3.1%, which was as expected. However, five loci (RM44, RM144, RM221, RM341, and RM1189) across the total sample and three accessions (Delitus, Colusa, and IR659-10-8-3) across the total loci had observed heterogeneity >10%. Notably, IR659-10-8-3 (T018) had an observed heterogeneity as high as 46%, indicating that this sample was not yet stable or purified.
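The screening step described above could look roughly like the following Python sketch. It assumes that observed heterogeneity is scored as the fraction of loci at which more than one allele is detected in an accession. That definition, the function name, the accession labels, and the allele calls are all assumptions made for illustration.

```python
def observed_heterogeneity(genotype):
    """Fraction of scored loci at which more than one allele was detected."""
    # Each locus is a tuple of allele sizes; an empty tuple means missing data.
    scored = [alleles for alleles in genotype if alleles]
    if not scored:
        raise ValueError("no scored loci")
    mixed = sum(1 for alleles in scored if len(set(alleles)) > 1)
    return mixed / len(scored)


# Hypothetical calls: one tuple of fragment sizes per locus.
samples = {
    "ACC_A": [(150,), (200,), (98,), (310,), (122,)],
    "ACC_B": [(150, 154), (200,), (98, 102), (310,), ()],
}

THRESHOLD = 0.10  # the 10% level mentioned in the text
flagged = {
    name: observed_heterogeneity(calls)
    for name, calls in samples.items()
    if observed_heterogeneity(calls) > THRESHOLD
}
print(flagged)
```

The same function can be applied per locus by passing the calls of all accessions at that locus instead of all loci of one accession.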
Groupings of USA-Developed and Important Introduced Cultivars at Different Time Periods
To explore patterns in U.S. rice breeding history, we classified the 145 rice cultivars into four groups according to the time period they were first introduced to the USA or released for production (Table 1). During the first period (early 1900s to 1929), 18 cultivars were imported from Asia and crosses were initiated from these introductions. Five of these cultivars were brought directly from the Philippines, China, and Madagascar, two were from unknown sources, and the remaining 11 were selections from heterogeneous parental accessions, most of which were collected in Asia. During the second period (1930 to 1959), extensive crosses were made among the first generation of cultivars and an additional 31 cultivars were developed or introduced. The third period (1960 to 1979) was marked by the introduction of semidwarf germplasm. Although this material was widely used in rice breeding programs, few of the 44 cultivars released during this period were semidwarf. Many of the 52 cultivars released during the fourth time period (1980 to 2000) were semidwarf or of short stature.
The 18 cultivars imported or released during the first period (1900 to 1929, T1) were classified into three groups (Fig. 1). The first group (T1G1) consisted of one cultivar, Early Wataribune (EYWB), from California and five cultivars from China and the Philippines. They all shared the characteristic of short grains. Colusa (COLU) was selected from the cultivar Chinese (CHNA). It should be noted that CHNA used in this study and cited in Dilday (1990) may or may not be the cultivar from which COLU was developed as the reported origin of the CHNA cultivar from which COLU was selected is Italy (Johnston, 1958), while the CHNA used in our study is from China. Nevertheless, the results of our analysis suggest that the CHNA used in this study is indeed closely related to COLU. Caloro (CALO) was selected from EYWB. This group formed the foundation for rice breeding in California, which is the only region in the USA that grows temperate japonica (TMJ). The second group (T1G2) consisted of eight accessions, three of which can be traced back to Blue Rose (BROS) or its improved versions and they were grouped closely together. Edith (EDTH) was clustered closely with Honduras (HNDS), from which it was selected. Delitus (DLTS) is joined with this group at a greater genetic distance. T1G2 was selected by breeders in the southern states, mainly Louisiana, and formed the foundation for U.S. medium-grain tropical japonica (TRJ-M). The third group (T1G3) consisted of four accessions, all of which have long grains and three of which were selected by breeders in Louisiana. Nira (NIRA) is loosely joined with this group at a greater genetic distance. This group formed the foundation for U.S. long-grain tropical japonica (TRJ-L), which is the major type of rice in the southern U.S. rice belt. On the basis of SSR analysis, the three groups (TMJ, TRJ-M, and TRJ-L) were already differentiated, genetically, by 1929, which suggests that they were derived from existing subpopulations in Asia.
During the first three decades of the 20th century, U.S. breeders did not use or develop any indica germplasm.
The 44 cultivars in the third period were classified into four groups (S-Fig. 2; available online). Group 1 (T3G1) consisted of 11 cultivars, of which California contributed 80%. All were of the short to medium grain type. Group 2 (T3G2) had 12 medium grain cultivars developed by southern states, primarily by Arkansas. However, DLTS was loosely joined to this group at a greater genetic distance. Group 3 (T3G3) consisted of 17 new long grain (TRJ-L) cultivars developed by southern states. NIRA stands alone in the intermediate position between TMJ and TRJ. Group 4 (T3G4) contained two collected indica and L110, a Louisiana cultivar selected from TN1/H4, where H4 was introduced from Sri Lanka as PI275451 (http://www.ars-grin.gov/cgi-bin/npgs/html/acchtml.pl?1206433; verified 1 September 2004). Cultivar K65, a collection from Surinam, stands alone between the indica and japonica groups and was thus called an interspecies group.
The fourth time period (1980 to 2000) contained 52 new cultivars, which were clustered into four groups (S-Fig. 3; available online). Group 1 (T4G1) included 10 short to medium grain cultivars developed in California and representative of U.S. TMJ for that time period. The second group (T4G2) contained seven medium grain cultivars, all developed by southern states and belonging to TRJ-M. Group 3 (T4G3) had 34 new long grain cultivars. This group contained U.S. TRJ-L and California long grain japonica. Group 4 (T4G4) included only one indica, TeQing (TQNG), which was collected from China and has been used in recent years for introgression of its high yielding alleles. This group is substantially distinct from the other japonica groups.
Cluster analysis clearly shows that released U.S. rice cultivars can be classified into three groups: the TMJ group, short season, cold tolerant temperate japonica, mostly developed and used in California with short or medium grains; the TRJ-M group, which has medium grain length; and the TRJ-L group, which has long grain length and remains the predominant type in the southern USA. Indica germplasm has been utilized as donors of desirable genes such as those conferring semi-dwarf stature, disease resistance, and high yield to improve U.S. japonica cultivars but has never been used directly in production. The genetic structure of temperate and tropical japonica is clearly distinguished throughout the history of U.S. rice breeding in the 20th century. This difference is paralleled by the clear distinction between long and short-to-medium grain cultivars. U.S. temperate japonica are consistently associated with short to medium grains and U.S. tropical japonica are divided into two well-defined subgroups: one associated with medium grains and the other with long grains. Indica cultivars are always substantially different from japonica cultivars.
A number of papers regarding the classifications of U.S. rice cultivars have been published. For example, Cao and Oard (1997) analyzed 26 U.S. elite rice cultivars and lines with 69 RAPD (random amplified polymorphic DNA) primers. They found that cultivars with the same maturity group or grain type were generally placed together in RAPD-based cluster analysis. The present SSR-based groupings are consistent with maturity group and grain type as well. Ni et al. (2002) evaluated the genetic diversity of 38 diverse rice cultivars (O. sativa) and two wild species accessions (O. rufipogon Griffith and O. nivara Sharma et Shastry) using 111 SSR markers. They classified the japonica accessions into two groups: tropical japonica and temperate japonica. While these studies provided general information about U.S. rice cultivar classifications, the relatively small number of U.S. accessions and/or small number of markers included limit their resolution and power. In this study, examination of 169 SSR markers revealed that clear and consistent population structure exists in the 145 rice cultivars used in or developed by U.S. breeding programs.
Relationship between Genetic Distance-Based Groupings and Pedigrees
There were several subgroups (Fig. 2) within each of the three groups of U.S. japonica cultivars. Cultivars in the same subgroup usually shared a high proportion of ancestry and/or agronomic characteristics such as maturity and disease resistance. Close scrutiny of each subgroup provides useful information for cultivar development. TMJ consisted of at least two major subgroups. Sub1-1 included nine accessions, eight of which were collected or selected from Asian cultivars, with only one (Nortai, NTAI) developed in the USA. In general, they are not actively used in today's U.S. rice breeding programs, so this subgroup is referred to as "Old Cultivars." Sub1-2 had 24 cultivars, all of which were derived from CALO. Within this subgroup, CS-M3 (CSM3) is widely used as a parent in recent and contemporary breeding programs, but its pedigree can be traced back to CALO. This subgroup is therefore referred to as "Caloro & CS-M3" subgroup in Fig. 2. Cultivars in Sub1-2 shared at least 77% ancestry with TMJ (S-Table 1) and all had early season maturity. Sub3-4 and 3-5 all contain the semidwarf gene, sd1, or have very short stature, and are high yielding and early maturing. Other subgroups also were identifiable with their most common-shared ancestor, such as Sub2-2 (Lacrosse), Sub3-1 (Starbonnet), and Sub3-2 (Fortuna) (Fig. 2). All cultivars except the Old Cultivars subgroup in TRJ-M shared BROS in their pedigrees; therefore, BROS was the most important ancestor for TRJ-M group. All subgroups in TRJ-L shared Rexoro (RXOR) in their pedigrees with different percentages, and most of them also shared Fortuna (FRTA) in their pedigrees. Therefore, RXOR and FRTA were the two primary contributors to TRJ-L as reported by Dilday (1990) and Mackill and McKenzie (2003). However, more recently Starbonnet (STBN), Bluebelle (BBLE), and the long grain California cultivars L201 and L202 have been used repeatedly as parents, underwriting the subgroup identity of this group.
This study included five sets of full sibs: Cypress-Jodon (L202/Lemont), Lemont-Gulfmont (Lebonnet//CIor 9881/IR659-10-8-3), Maybelle-Jackson (Skybonnet/L201), Nova-STG533187 (Lacrosse//Zenith/Nira), and Bluebonnet 50-Sunbonnet (selections from Bluebonnet). On the basis of their pedigrees, each pair was expected to group together; however, this was not always the case. Marker-based clustering trees provided insight into the true genetic makeup of each accession. Divergence of a pair of full sibs may reflect strong divergent selection by breeders. For example, Cypress (CPRS) clustered with the L202 subgroup, as might be expected, but Jodon (JODN) shared the BBLE subgroup with Lemont (LMNT) (Fig. 2). This could be attributed to different selection criteria imposed by Louisiana breeders that exploited inherent genetic differences between L202 and LMNT. The Structure program revealed that LMNT and L202 shared 98 and 84% of TRJ-L ancestry, respectively (S-Table 1), so they were possibly divergent for the other 18% of their ancestry. In other cases, such as Jackson (JKSN) and Maybelle (MBLE), full-sibs were closely clustered together, reflecting the high similarity between their parents (L201 and SKBT, which both shared approximately 99% ancestry of TRJ-L) (S-Table 1).
While the SSR-based classifications were consistent with cultivar pedigrees in our study, some discrepancies were observed. In some cases, reasons for the discrepancies have been proposed, as with the full sibs discussed earlier. However, accession T018 was classified into the indica group instead of the tropical japonica group as expected (Fig. 2). T018 was developed at IRRI, but its pedigree reveals that it was selected from backcross progenies of BBLE/TN-1 backcrossed five times to BBLE. As a result, it was expected to have approximately 99% of its genome from BBLE, a tropical japonica long grain cultivar (TRJ-L). However, the Structure program revealed that T018 had only 22% ancestry from japonica and 62% from indica, and therefore it was classified as an indica (S-Table 1). While backcrossing followed by intensive selection can shift allele frequencies from the expected percentage based on pedigree predictions, this is an extreme example and might be related to its high percentage (46%) of observed heterogeneity.
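The pedigree expectation mentioned above can be checked with a short calculation. The sketch below assumes the textbook model in which each backcross halves the donor parent's expected share of the genome, with no selection or linkage effects. Under that assumption, five backcrosses after the initial cross give about 98.4% recurrent parent genome, close to the figure of approximately 99% cited in the text. The function name is hypothetical.

```python
def recurrent_parent_fraction(n_backcrosses):
    """Expected genome share of the recurrent parent, ignoring selection and linkage."""
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    # The F1 (zero backcrosses) carries half of each parent's genome;
    # every backcross halves the remaining donor share.
    return 1.0 - 0.5 ** (n_backcrosses + 1)


for n in range(0, 6):
    print(n, f"{recurrent_parent_fraction(n):.4f}")
```

Against this expectation, the 22% japonica ancestry estimated for T018 is far lower than pedigree alone would predict.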
Breeding Patterns in U.S. Rice Germplasm Development
This study contained a total of 115 rice cultivars that were developed in the USA during the period 1930 to 2000. Among the 115 cultivars, 24 (21%) were TMJ, 22 (19%) were TRJ-M, and 69 (60%) were TRJ-L (Table 3). An additional 30 cultivars were collected from other countries and brought to the USA as a source of germplasm enhancement, or directly selected from those collections.
Interestingly, while group 2 (TRJ-M) combines the medium grain characteristic of TMJ in group 1 and many of the tropical japonica characteristics of group 3, cultivars from group 2 were most frequently used as parents for the improvement of other groups and correspondingly, group 2 cultivars were also derived from a higher proportion of crosses with other groups. Therefore TRJ-M may be regarded as providing a genetic bridge between TMJ and TRJ-L.
Comparisons between Genetic Distance-Based and Model-Based Groupings
While genetic distance-based (GD-based) clustering is powerful, easy to use, and has been widely reported in the literature, the problem with this approach is that the number of groups identified is based on an arbitrary cutoff that depends on the user's judgment, with no standard way of evaluating the statistical significance of the grouping result. Model-based methods, such as that reported by Pritchard and colleagues (Pritchard et al., 2000; Pritchard and Donnelly, 2001), use a Bayesian clustering approach in which each group or population is characterized by a set of allele frequencies at each locus along with the likelihood for each K (number of groups). This enables users to choose the number of groups with the highest log likelihood. Model-based methods have been widely used to identify population structure for association mapping in human genetics (Pritchard and Rosenberg, 1999; Pritchard and Donnelly, 2001; Pritchard and Przeworski, 2001; Rosenberg et al., 2002) and, to a lesser extent, plant genetics (Remington et al., 2001; Thornsberry et al., 2001). In this study, we used both genetic distance-based and model-based methods to assess population structure.
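The dependence of GD-based groupings on a user-chosen cutoff can be illustrated with a small UPGMA (average-linkage) sketch in plain Python. The accession labels and distances below are hypothetical, and this is not the PowerMarker implementation used in the study. It only shows how the same distance matrix yields a different number of groups as the cutoff moves.

```python
def upgma_groups(names, dist, cutoff):
    """Average-linkage (UPGMA) clustering that stops merging once the
    closest pair of clusters is farther apart than `cutoff`."""
    clusters = [[i] for i in range(len(names))]

    def avg_distance(a, b):
        # UPGMA distance: mean of all pairwise distances between members.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [
            (avg_distance(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        ]
        d, x, y = min(pairs)
        if d > cutoff:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(names[i] for i in c) for c in clusters)


# Hypothetical pairwise genetic distances among four accessions.
names = ["A", "B", "C", "D"]
dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.8, 0.2, 0.0],
]

for cutoff in (0.15, 0.3, 0.7):
    print(cutoff, upgma_groups(names, dist, cutoff))
```

With these values the three cutoffs give three, two, and one group respectively, which is the arbitrariness the paragraph above describes.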
Groups resulting from the two methods were consistent for 139 (96%) cultivars (Fig. 2), indicating a good consensus between the two methods. In Structure analysis, comparative runs showed that the highest log likelihood was achieved when K was set at 5. Therefore, both methods classified the 145 rice cultivars into five groups as summarized in the previous section. K65, a genetically distinct cultivar, clustered independently from either indica or tropical japonica, indicating that this accession was different from both the indica and japonica subspecies or the admixed pedigrees of the two subspecies. This result agreed with GD-based analysis (Fig. 2). NIRA, an early selection from Louisiana, had 34, 26, and 40% of ancestry from TRJ-L, TRJ-M, and K65, respectively (S-Table 1), as estimated by Structure. Thus, it appeared most similar to K65, but clearly showed ancestry with both long and medium grain tropical japonica groups. The other three accessions (WC-6, Badkalamakati, and LTSL) grouped together with K65 by Structure were also old introduced cultivars and showed conflicting classifications between the GD- and model-based methods. DLTS, an accession that showed conflicting classifications, has 47% of TRJ-M, 39% of K65, and 14% of TRJ-L, and it was classified as either TRJ-M (Time 1 and 3 in Fig. 1 and S-Fig. 2) or TRJ-L (Time 2 in S-Fig. 1). Therefore, accessions sharing a high percentage (>39%) of ancestry with K65 cannot be reliably classified into either TMJ or TRJ groups, though they may share ancestry with one or both.
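The two model-based steps discussed above, choosing K and reading ancestry shares, can be sketched as follows. The log-likelihood values are hypothetical and were chosen only so that K = 5 comes out highest, mirroring the reported outcome. The ancestry shares for NIRA and DLTS are those quoted in the text. The variable and function names are illustrative, and this code does not reproduce the Structure program itself.

```python
# Hypothetical log-likelihoods: three runs for each K from 2 to 8.
log_likelihoods = {
    2: [-9420.1, -9418.7, -9425.3],
    3: [-9105.6, -9110.2, -9103.9],
    4: [-8890.4, -8902.8, -8895.0],
    5: [-8712.5, -8720.1, -8709.3],
    6: [-8760.9, -8755.2, -8771.4],
    7: [-8810.3, -8799.8, -8825.6],
    8: [-8902.7, -8888.1, -8915.0],
}

# Keep the best run for each K, then pick the K with the highest value.
best_k, best_loglik = max(
    ((k, max(runs)) for k, runs in log_likelihoods.items()),
    key=lambda item: item[1],
)
print(best_k, best_loglik)

# Ancestry shares as quoted in the text (S-Table 1 of the article).
ancestry = {
    "NIRA": {"TRJ-L": 0.34, "TRJ-M": 0.26, "K65": 0.40},
    "DLTS": {"TRJ-M": 0.47, "K65": 0.39, "TRJ-L": 0.14},
}


def top_two(shares):
    """Return the two groups with the largest ancestry shares."""
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0], ranked[1]


for name, shares in ancestry.items():
    first, second = top_two(shares)
    print(name, first, second)
```

For both accessions the leading share is below 50% and the runner-up is close behind, which is consistent with their unstable placement across analyses.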
On the basis of our results, we conclude that both GD-based and model-based grouping methods worked equally well; however, for a data set with clear pedigree relationships like the one in this study, the GD-based method generated results more consistent with pedigree information than the model-based method. Unlike the GD-based method, the model-based method provided each entry's percentage of ancestry, which is valuable for breeders. However, the model-based method does not provide information about subgroups, as is provided by the GD-based method.
Understanding the population structure of U.S. rice germplasm is a prerequisite for future studies aimed at association mapping where regions of the genome can be associated with phenotypes of interest. The ability to use genetic resources familiar to the rice breeding community as the foundation for establishing phenotypic-genotypic associations offers exciting opportunities for gene discovery but will only be efficient if population structure is taken into account (Flint-Garcia et al., 2003).
| ACKNOWLEDGMENTS |
|---|
| NOTES |
|---|
1 Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.
Received for publication December 18, 2003.
| REFERENCES |
|---|