a Univ. of Georgia, Center for Applied Genetic Technologies, 111 Riverbend Road, Athens, GA 30602-6810
b Pioneer Hi-Bred Int'l, Inc., Crop Genetics Research and Development, 19456 St. Hwy 22, Mankato, MN 56001
* Corresponding author (rboerma{at}arches.uga.edu).
| ABSTRACT |
|---|
0.01) associations. In the PI97100 x Coker 237 population, two (cqProt-001 and cqProt-002) of four previously described QTL for seed protein, two (cqOil-001 and cqOil-002) of three QTL for oil content, and none of three QTL for seed weight were confirmed in the independent population. In the Young x PI416937 population, none of the three previously reported QTL for protein was confirmed. One (cqOil-003) of three QTL for oil content and two (cqSd wt-001 and cqSd wt-002) of three QTL for seed weight were verified. The unconfirmed QTL may have been false positives, or they may have been specific to the sample of lines used in the original populations. These results confirm the necessity of validating QTL in parallel populations before utilizing them in a plant improvement program.
Abbreviations: ANOVA, analysis of variance; cM, centimorgan; cqQTL, confirmed quantitative trait locus; LG, linkage group; MAS, marker-assisted selection; QTL, quantitative trait locus/loci; RFLP, restriction fragment length polymorphism; SSR, simple sequence repeat
| INTRODUCTION |
|---|
Brummer et al. (1997) identified QTL for soybean seed protein and oil content using eight distinct populations. They reported that the phenotypic effect of some QTL was sensitive to the environment in which they were evaluated, but did detect environmentally stable QTL. In maize (Zea mays L.), results from three independent experiments repeated in the same genetic background revealed that the QTL identified were not consistent (Beavis et al., 1994; Beavis, 1994). Confounding factors such as population structure, sources of parental lines, different sets of environments, and sampling of progeny were reported as possible causes for this discrepancy (Beavis, 1994). In another maize study, Ajmone-Marsan et al. (1996) evaluated previously identified QTL for grain yield in an independent sample drawn from the same population. They found that two QTL were consistent with those detected in the previous experiments, but two QTL identified in the first sample remained undetected in the independent sample. Melchinger et al. (1998) genotyped two independent samples from the same F2 maize population using RFLP markers. For grain yield and other agronomically important traits, they detected a total of 107 QTL from the first sample and 39 QTL from the second independent sample. They found that only 20 QTL were common in both population samples.
In contrast to QTL detection studies, soybean qualitative genetic studies require a hypothesis generation and a second or confirming generation to assign a gene symbol (Soybean Genetics Committee, 1997). This second generation can be progeny of the hypothesis generation or progeny of a testcross. However, this confirmation step has not been required in QTL mapping studies in soybean or other species (Boerma and Mian, 1999). Since there is limited and conflicting information confirming the reported QTL in soybean, it is important to conduct validation experiments before the development of breeding strategies based on unconfirmed QTL reported in the literature. Furthermore, a number of QTL mapping studies have not used multiple environments or populations for the collection of phenotypic data. Results from QTL confirmation experiments will generate new knowledge about the limitations and strengths of QTL utilization in a MAS project. The plethora of QTL data will serve the modern plant improvement programs and future genomic studies only when these QTL are proven to be real.
Another important issue in the application of reported QTL is the ability to use closely linked markers, of the same or a different marker type, for the actual selection. Some marker types, such as RFLP, have limited polymorphism in elite soybean breeding populations, so the RFLP marker used to map a QTL initially may not be usable in another population. In addition, more cost-effective DNA marker systems, such as SSRs, have been developed since many of the original QTL discoveries. Therefore, the ability to select a closely linked marker, other than the original marker that identified the QTL, is often required to utilize previously reported QTL.
In soybean, there are a number of important QTL mapping studies for seed protein, oil content, and seed weight (Diers et al., 1992b; Lee et al., 1996b; Mansur et al., 1993, 1996; Mian et al., 1996; Maughan et al., 1996; Brummer et al., 1997; Qiu et al., 1999; Sebolt et al., 2000; Hoeck et al., 2003). Soybean seed is a major source of protein for animal feed and oil for human consumption. Simultaneous increases in protein and oil content can proceed only to a limited extent since most experimental data show that protein and oil content are negatively correlated (Burton, 1987). Intense breeding efforts have resulted in the selection of two types of seed composition, those with a higher percentage of protein content but lower oil, and those with a higher percentage of oil content but lower protein (Miller and Fehr, 1979; Brim and Burton, 1979; Burton and Brim, 1981; Burton, 1985; Wilcox, 1985). Seed weight, measured as mass per seed, is an important yield component of soybean and is generally positively correlated with seed yield (Burton, 1987). Soybean cultivars with either very small or very large seed weights are used in the production of many specialty human foods. The demand for these food-type soybeans is steadily increasing in the global market at a rate of 3 to 5% per year. Sales of the food-type soybeans have increased by 450% in the last 18 yr (Wilson, 1999).
One objective of this study was to confirm or refute previously reported QTL for seed protein, seed oil, and seed weight in an independent population of PI97100 x Coker 237 with the same RFLP markers that originally identified the QTL location. The second objective was to verify or refute previously reported QTL in an independent population of Young x PI416937 for the same seed traits using SSR markers mapped in the same region as the original RFLP markers.
| MATERIALS AND METHODS |
|---|
In 1996, 176 F2:4 soybean lines of the PI97100 x Coker 237 population and 176 F2:4 soybean lines of the Young x PI416937 population (four of the original 180 F2-derived lines from each population were not included because of seed limitations) were grown at the Univ. of Georgia Plant Sciences Farm near Athens, GA (Athens 96) and the Univ. of Georgia Southwest Branch Experiment Station near Plains, GA (Plains 96). At both locations, the two populations were grown in separate experiments. The soil type at Athens was Appling coarse sandy loam (clayey, kaolinitic, thermic Typic Hapludults), whereas the soil type at Plains was Greenville sandy clay loam (clayey, kaolinitic, thermic Typic Rhodic Paleudults). The experimental unit for each entry was two 4-m rows spaced 0.76 m apart. To control the effects of soil heterogeneity, the lines in each population were randomly assigned to four sets of 44 lines for a total of 176 F2-derived lines. Each set included three entries of the male and the female parent (total of 50 entries per set) and was planted in a randomized complete block design with two replications. The sets were randomized within the replications. At maturity each plot was harvested with a plot combine.
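The field layout described above can be sketched in a few lines of Python. This is only an illustration of the set structure (176 lines split into four sets of 44, three entries of each parent per set, two replications); the function name, the entry labels, and the random seed are hypothetical, and the randomization of set order within replications is not modeled.

```python
import random


def build_sets(lines, parent_a, parent_b, n_sets=4, parent_entries=3,
               n_reps=2, seed=0):
    """Assign lines to sets, add parental entries, and randomize each replication.

    Returns a list of sets; each set is a list of replications, and each
    replication is a randomized plot order of that set's entries.
    """
    if len(lines) % n_sets != 0:
        raise ValueError("lines must divide evenly into sets")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    shuffled = list(lines)
    rng.shuffle(shuffled)      # random assignment of lines to sets
    per_set = len(shuffled) // n_sets
    layout = []
    for s in range(n_sets):
        entries = shuffled[s * per_set:(s + 1) * per_set]
        # three entries of each parent are added to every set
        entries += [parent_a] * parent_entries + [parent_b] * parent_entries
        reps = []
        for _ in range(n_reps):
            plot_order = list(entries)
            rng.shuffle(plot_order)  # independent randomization per replication
            reps.append(plot_order)
        layout.append(reps)
    return layout


# Hypothetical labels standing in for the 176 F2-derived lines and two parents.
lines = [f"line_{i:03d}" for i in range(1, 177)]
layout = build_sets(lines, parent_a="parent_A", parent_b="parent_B")
print(len(layout), "sets,", len(layout[0]), "replications,",
      len(layout[0][0]), "entries per set")
```

With these inputs each set holds 44 + 3 + 3 = 50 entries, matching the count given in the text.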
For protein and oil content determination, a 50-g seed sample from each plot was sent to the USDA-ARS National Center for Agricultural Utilization Research at Peoria, IL. An 18- to 20-g sample of seed was analyzed for protein and oil composition with a model 1255 Infratec NIR food and feed grain analyzer (Ultra Tec Manufacturing, Inc., Santa Ana, CA). The protein and oil values were converted to a moisture-free basis. The seed weight for each plot was determined on the basis of a 100-seed sample. For each environment and confirmation population the four individual test means for protein, oil, and seed weight did not significantly differ (P > 0.05) based on t tests using the error variances from each test.
The phenotypic data for protein, oil, and seed weight were analyzed by analysis of variance (ANOVA) with the Agrobase software (Agronomix Software Inc., Winnipeg, Canada). For all statistical models, replications, environments, and genotypes were considered random effects.
Marker Data Collection and Statistical Analysis
In both populations, young trifoliolate leaves from 12 plants of each line (two replications) were sampled for DNA extraction from the 1995 hill-plot experiments after the hill plots were thinned. The DNA isolation, restriction enzyme digestion, electrophoresis, Southern blotting, and hybridization procedures were performed according to Lee et al. (1996a, 1996b). Previously reported RFLP markers (Mian et al., 1996; Lee et al., 1996b) associated with seed protein, seed oil, and seed weight QTL were utilized in the confirmation population of PI97100 x Coker 237. The following RFLP loci were evaluated for seed protein content: E/A454-1 (where E/A454-1 refers to RFLP marker A454-1 located on Linkage Group E; other RFLP markers are annotated similarly), K/A065-1, UNK/A132-4, H/A566-2; seed oil: C1/A063-1, G/L154-2, H/A566-2; and seed weight: D2/A257-1, G/A235-1, M/Cr529-1. For the Young x PI416937 population, SSR markers were selected from the NC113 linkage map, developed by mapping SSR markers in an F4-derived population of Young x PI416937 (Narvel et al., 2004), and the consensus soybean map (Cregan et al., 1999) in the same genomic region as the RFLP markers identified previously to be associated with QTL for seed traits. The following SSR loci were evaluated for seed protein: K/Satt441 (where K/Satt441 refers to SSR marker Satt441 on LG K; other SSR markers are annotated similarly) and K/Satt559 for RFLP K/A199-1; N/Satt530 and N/Sat_084 for RFLP N/A071-2; C1/Satt338 and C1/Satt180 for RFLP C1/gac197-1; seed oil: D2/Satt208 and D2/Satt311 for RFLP D2/cr142-1; L/Satt398 and L/Satt313 for RFLP L/A023-1; J/Satt380 and J/Satt244 for RFLP J/B122-1; and seed weight: G/Satt303 for RFLP G/B031-1n; E/Satt263 for RFLP E/Blt049-2n; C1/Satt396 for RFLP C1/A059-1.
For the SSR marker detection, PCR reactions were prepared by the protocol by Diwan and Cregan (1997). The reactions were performed in a dual 384-well and 96-well GeneAmp PCR System 9700 or a 384-well ABI 877 robotic thermal cycler (PE-ABI, Foster City, CA). The cycling program consisted of 1 min at 95°C, followed by 32 cycles of 25 s for denaturation at 94°C, 25 s of annealing at 46°C, and 25 s of extension at 68°C. At the end of the cycling procedure, the reaction mixtures were held at 4°C. Electrophoresis was run on an ABI-Prism 377 DNA Sequencer (PE-ABI, Foster City, CA) with 120-mm plates at 750 V for 2 h. Lanes were loaded on a 4.8% (w/v) acrylamide:bisacrylamide (19:1) gel with KLOEHN micro-syringes (Kloehn Ltd., Las Vegas, NV). Genescan (Version 3.0) was used to analyze DNA fragments, which were scored with Genotyper (Version 2.1).
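As a quick check on the cycling program above, the programmed hold time can be added up directly. The sketch below encodes only the steps stated in the text; the variable and function names are hypothetical, and ramp times between temperatures and the final 4°C hold are ignored.

```python
# Cycling program as described in the text: (step name, temperature in C, seconds).
INITIAL_DENATURATION_S = 60  # 1 min at 95 C
CYCLES = 32
STEPS = [
    ("denaturation", 94, 25),
    ("annealing", 46, 25),
    ("extension", 68, 25),
]


def programmed_seconds(initial_s, cycles, steps):
    """Sum of programmed hold times, excluding ramping and the final 4 C hold."""
    per_cycle = sum(seconds for _, _, seconds in steps)
    return initial_s + cycles * per_cycle


total_s = programmed_seconds(INITIAL_DENATURATION_S, CYCLES, STEPS)
print(total_s, "s =", total_s / 60, "min")  # prints: 2460 s = 41.0 min
```

Actual run time on a thermal cycler would be longer because of ramping between temperatures.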
The phenotypic data from the F2-derived lines were analyzed for the appropriate RFLP and SSR markers. Single-factor ANOVA was used to test for significant (P ≤ 0.01) differences among the marker genotypic class means using an F-test from the Type III mean squares obtained from the GLM procedure (SAS Institute, 1992). The mean seed protein content, oil content, and seed weight across years and locations, as well as in individual environments, were compared for the lines homozygous for the male parent allele and the lines homozygous for the female parent allele for each marker identifying a QTL. Previously reported QTL were assumed to be confirmed if the means of these two groups were significantly different (P ≤ 0.01) and the parental alleles produced a similar effect as in the original mapping studies.
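The single-factor analysis and the confirmation rule can be illustrated with a minimal Python sketch. It is not the SAS GLM analysis used in the study: it computes the one-way F statistic and the proportion of phenotypic variation explained (R²) for marker genotype classes, and leaves the P value to a statistical package. The function names and the toy phenotype values are hypothetical, and treating the 0.01 threshold as inclusive is an assumption.

```python
def one_way_anova(groups):
    """Single-factor ANOVA over marker genotype classes.

    groups: list of lists of phenotype values, one list per genotype class.
    Returns (F, df_between, df_within, r_squared).
    """
    values = [y for g in groups for y in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    r_squared = ss_between / (ss_between + ss_within)  # share of variation explained
    return f_stat, df_between, df_within, r_squared


def is_confirmed(p_value, original_high_allele, confirmation_high_allele, alpha=0.01):
    """Confirmation rule from the text: significant at alpha AND the same
    parental allele increases the trait as in the original mapping study.
    Using <= for the threshold is an assumption of this sketch."""
    return p_value <= alpha and original_high_allele == confirmation_high_allele


# Toy phenotype values for three genotype classes (hypothetical data).
classes = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w, r2 = one_way_anova(classes)
print(round(f_stat, 3), df_b, df_w, round(r2, 4))
```

The P value for the F statistic would come from the F distribution with the returned degrees of freedom; that value and the direction of the allele effect are the two inputs to `is_confirmed`.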
| RESULTS |
|---|
0.01) on the basis of the combined analysis over the three environments (Table 1). The RFLP markers E/A454-1 and UNK/A132-4 were found to be associated with seed protein (P ≤ 0.0001 and P ≤ 0.0116, respectively). The QTL at E/A454-1 was significant in all three environments (Athens 95, Athens 96, and Plains 96), whereas the UNK/A132-4 QTL was significant (P ≤ 0.01) in Plains 96 and approached significance (P ≤ 0.1) in Athens 95 and Athens 96 (Table 1).
The H/A566-2 locus was not significant in any environment in our study. In addition, the QTL previously identified by the K/A065-1 marker was not detected in our confirmation population of PI97100 x Coker 237. In the original mapping study, the K/A065-1 QTL was detected in only one of the two environments evaluated, where it had a large effect (R² = 21%) (Lee et al., 1996b).
Seed Oil Content
The seed oil content of the F2-derived lines showed continuous variation (Fig. 2). Combined analysis over the three environments (Athens 95, Athens 96, and Plains 96) showed that PI97100 and Coker 237 differed by 28 g kg⁻¹ in seed oil content, with Coker 237 having 16% higher oil content than PI97100. The seed oil of the progeny lines ranged from 169 to 197 g kg⁻¹ and the mean oil content of the population was 185 g kg⁻¹ (Fig. 2).
0.0011 and P ≤ 0.0008, respectively) in the confirmation population of PI97100 x Coker 237. Consistent with the original mapping study, we found the PI97100 allele to be associated with increased oil at the C1/A063-1 locus, whereas for the H/A566-2 locus, the Coker 237 allele was associated with increased oil. The C1/A063-1 and H/A566-2 QTL accounted for 8 and 8.3% of the total phenotypic variation for seed oil, respectively (Table 1). The confirmed C1/A063-1 oil QTL was named cqOil-001 and the confirmed H/A566-2 oil QTL was designated cqOil-002. The G/L154-2 oil QTL was not significant in our study. Lee et al. (1996b) reported that at the G/L154-2 locus, lines with both marker bands had a higher seed oil percentage than homozygous lines. The G/L154-2 locus may have exhibited pseudo-overdominance (i.e., repulsion phase linkage of the QTL) (Fasoula and Fasoula, 1997). In the original study, the G/L154-2 locus explained 21% of the total phenotypic variation, but it was detected in only one environment. In the confirmation population of PI97100 x Coker 237, the G/L154-2 locus was not significant in any environment and there was no evidence of pseudo-overdominance (Table 1).
Relationship between Seed Protein and Oil Content
In soybean, seed protein and oil contents have been reported to be negatively correlated (Burton, 1987). In this experiment, negative phenotypic correlations between seed oil and protein were observed in both 1995 and 1996 (r = −0.64 and r = −0.55, respectively). The negative association between protein and oil content is in agreement with earlier studies (Johnson and Bernard, 1962; Kwon and Torrie, 1964; Smith and Weber, 1968). In some mapping studies, the association between these two traits was explained by QTL conditioning both traits (Diers et al., 1992b; Mansur et al., 1993). We analyzed all the RFLP markers identifying protein and oil QTL in Table 1 for both seed protein and seed oil content, and we did not detect any common QTL for protein and oil content (data not shown). The QTL linked to the H/A566-2 locus was associated with both protein and oil in the original mapping study (Lee et al., 1996b). In our study, H/A566-2 (cqOil-002) was significant only for seed oil content. The C1/A063-1 marker (cqOil-001) was confirmed to be associated with oil content. This marker has also been reported to be significantly associated with seed protein content in other populations (Brummer et al., 1997). In addition, the E/A454-1 marker (cqProt-001) was significantly associated with seed protein content in both the mapping and the confirmation population of PI97100 x Coker 237, and it has also been reported to be associated with seed oil content (Diers et al., 1992b).
Seed Weight
The mean seed weight of the 176 F2-derived lines in the PI97100 x Coker 237 population showed a continuous distribution (Fig. 3). Combined analysis over the three environments (Athens 95, Athens 96, and Plains 96) showed that Coker 237 had 4% larger seed weight than PI97100, but the difference was not statistically significant. The seed weight of the F2-derived lines ranged from 133 to 196 mg seed⁻¹ and the mean seed weight of the population was 161 mg seed⁻¹. The progeny exhibited transgressive segregation for both larger and smaller seed weight than the parents (Fig. 3).
Young x PI416937 Population
Seed Protein and Oil Content
The F2-derived lines of Young x PI416937 showed continuous variation for seed protein and seed oil content. Combined analysis over the two environments (Athens 96 and Plains 96) indicated that the mean seed protein was 467 g kg⁻¹ for PI416937 and 434 g kg⁻¹ for Young, with PI416937 having 8% higher protein than Young. The protein content of the progeny lines ranged from 392 to 480 g kg⁻¹.
Three independent QTL for seed protein that explained more than 10% of the phenotypic variation in the original mapping study (Lee et al., 1996b) were tested in the confirmation population of 176 F2:4 lines. The SSR markers linked to the RFLP marker that detected the QTL were chosen from the NC113 soybean mapping population and the soybean genetic linkage map (Cregan et al., 1999; Narvel et al., 2004) (Table 2). Single-factor ANOVA revealed that none of the SSR loci was confirmed to be associated with seed protein based on the combined analysis over two environments (Table 2). Since in the mapping study of Young x PI416937 the three QTL were detected in three different environments, the effect of these QTL was probably dependent upon the specific sample of lines used in the population (limited sample size).
0.01) in the confirmation population of Young x PI416937. They explained 8 and 7% of the phenotypic variation, respectively (Table 2). Consistent with the original mapping study, the PI416937 allele was associated with increased oil content for both L/Satt398 and L/Satt313 loci. We have designated this confirmed oil QTL as cqOil-003. The other two QTL for seed oil content were not confirmed in our independent population of Young x PI416937 (Table 2).
Seed Weight
The mean phenotypic data for seed weight across two environments (Athens 96 and Plains 96) indicated that PI416937 and Young differed by 28 mg seed⁻¹, with PI416937 having 17% larger seed weight. PI416937 averaged 193 mg seed⁻¹ and Young averaged 165 mg seed⁻¹. The progeny exhibited transgressive segregation for both larger and smaller seed weight and ranged from 131 to 234 mg seed⁻¹. Three independent QTL for seed weight with R² > 10% that were detected in the mapping population (Mian et al., 1996) were tested in the confirmation population of Young x PI416937. Single-factor ANOVA across two environments indicated that SSR markers G/Satt303 and E/Satt263 were confirmed to be associated with seed weight (P ≤ 0.01) in the confirmation population of Young x PI416937 (Table 2). They explained 8 and 18% of the phenotypic variation, respectively. The confirmed G/Satt303 and E/Satt263 QTL for seed weight were designated as cqSd wt-001 and cqSd wt-002, respectively. For both QTL, G/Satt303 (cqSd wt-001) and E/Satt263 (cqSd wt-002), the PI416937 allele was associated with larger seed weight (Table 2). The SSR marker C1/Satt396 was not found to be associated with seed weight.
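The relative parental differences quoted for this population can be checked from the reported means. The short sketch below assumes that each percentage is expressed relative to the lower parent (Young); the function name is hypothetical.

```python
def percent_higher(larger, smaller):
    """Difference between two means as a percentage of the smaller mean."""
    return 100.0 * (larger - smaller) / smaller


# Parental means reported in the text for Young x PI416937.
protein = percent_higher(467, 434)      # g per kg, PI416937 vs. Young
seed_weight = percent_higher(193, 165)  # mg per seed, PI416937 vs. Young
print(round(protein), round(seed_weight))  # prints: 8 17
```

Both values agree with the 8% and 17% differences stated for seed protein and seed weight.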
| DISCUSSION |
|---|
Some QTL (i.e., K/A065-1 and G/L154-2, Table 1) were detected in only one location in the original mapping studies (Lee et al., 1996b; Mian et al., 1996). These QTL were not detected in the confirmation population. In addition, some QTL for protein and seed weight that were detected in three locations in the original mapping studies could not be verified in the confirmation populations. There are several possible explanations for the inability to confirm the previously reported QTL. The unconfirmed QTL may have been false positives in the original mapping population (Type I error). The original mapping studies used a relaxed probability level of P ≤ 0.05 to test markers for significant associations; therefore, some reported QTL were probably Type I errors. Alternatively, the unconfirmed QTL may have been detected because of the limited or specific sampling of lines used in the original mapping population. The mapping population of PI97100 x Coker 237 consisted of 111 F2-derived lines and that of Young x PI416937 consisted of 120 F4-derived lines. In addition, the unconfirmed QTL could be environmentally sensitive QTL, as was found for protein and oil in the Brummer et al. (1997) study.
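The Type I error argument can be made concrete with a small calculation. The sketch below compares a 0.05 and a 0.01 threshold for a hypothetical number of marker tests; the count of 100 tests is an illustrative assumption, not a figure from the original mapping studies, and the second function assumes independent tests, which linked markers do not strictly satisfy.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of significant tests when no marker is truly associated."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


N_TESTS = 100  # hypothetical number of marker-trait tests, for illustration only
for alpha in (0.05, 0.01):
    print(alpha,
          expected_false_positives(N_TESTS, alpha),
          round(prob_at_least_one_false_positive(N_TESTS, alpha), 3))
```

Under these assumptions the relaxed threshold yields about five times as many expected false positives as the stricter one, which is consistent with some originally reported QTL failing to reappear in an independent population.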
Lande and Thompson (1990) reported that the QTL effects estimated from the same data used for QTL mapping were generally overestimated. Melchinger et al. (1998) reported that estimates of the phenotypic and genetic variance explained by QTL were considerably reduced when derived from an independent validation sample as opposed to estimates from the calibration sample of the same population used to map the QTL. Brummer et al. (1997) identified QTL for soybean seed protein and oil content using eight distinct populations and reported that some QTL were sensitive to the environment in which they were initially detected. Other studies in soybean have provided mixed results regarding the validation of reported QTL (Diers et al., 1992b; Mudge et al., 1997; Li et al., 2001). In maize, results from three independent experiments repeated in the same genetic background revealed that the identified QTL were not consistent (Beavis et al., 1994; Beavis, 1994). Other maize studies have also reported inconsistency in the QTL detected across two independent samples of the same population (Ajmone-Marsan et al., 1996; Melchinger et al., 1998).
Although many studies have been conducted to identify and map QTL of important traits, very few articles describe QTL validation and their use in a marker-assisted selection project. The precise identification of QTL is necessary for successful application of MAS in plant improvement programs and for the alignment of QTL onto physical maps and use of this information to identify the gene the QTL represents (Rafalski, 2001). For practical breeding applications, the QTL data already published will be useful only if QTL can be validated in independent populations, as required for the assignment of a qualitative gene symbol (Soybean Genetics Committee, 1997). Recognizing that only one sample of the meiotic events from a population is not adequate for QTL detection and verification is an important realization needed in the scientific community. Simply demonstrating that a complex trait can be dissected into QTL and mapped to approximate genomic locations using DNA markers is inadequate (Young, 1999). Results from QTL confirmation experiments will generate new knowledge about the limitations and strengths of QTL utilization in a MAS project. Reports of QTL detection will serve the modern plant improvement programs and eventually genome projects only when these QTL have been verified.
Our data indicate that in addition to improved phenotypic data collection, larger population sizes, and multiple environments, the precise identification of QTL requires independent verification through parallel populations. The next step would be to determine the significance of confirmed QTL in different genetic backgrounds, which is essential for further utilization in breeding programs. This study provided independent confirmation for two protein QTL (cqProt-001 and cqProt-002 identified by markers E/A454-1 and UNK/A132-4, respectively), three oil QTL (cqOil-001, cqOil-002, and cqOil-003 identified by markers C1/A063-1, H/A566-2, and L/Satt398, respectively), and two seed weight QTL (cqSd wt-001 and cqSd wt-002 identified by markers G/Satt303 and E/Satt263, respectively). These QTL were validated across different environments and two independent populations and designated as confirmed (cq). We would encourage the practice of assigning cq designations to loci that have been validated. We would like to propose the QTL nomenclature used in SoyBase (http://129.186.26.94/) preceded by the cq designation for the QTL that have been validated through advanced generations or parallel populations. This would allow researchers to recognize QTL that have been mapped and then confirmed in populations derived by independent meiotic events.
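The designations used in this study follow a simple pattern: the prefix cq, a trait label, a hyphen, and a three-digit serial number. A minimal sketch of that pattern is given below; the function name is hypothetical, and the format is inferred only from the names reported here (for example cqProt-001 and cqSd wt-002), not from a formal SoyBase specification.

```python
def cq_name(trait, number):
    """Build a confirmed-QTL designation such as 'cqOil-003'."""
    if number < 1:
        raise ValueError("serial number must be positive")
    return f"cq{trait}-{number:03d}"


# Designations reported in this study, grouped by trait label.
confirmed = {"Prot": 2, "Oil": 3, "Sd wt": 2}
names = [cq_name(trait, i) for trait, n in confirmed.items() for i in range(1, n + 1)]
print(names)
```

The resulting list contains the seven confirmed QTL named in the text.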
| ACKNOWLEDGMENTS |
|---|
Received for publication March 24, 2003.