a Division of Plant Sciences and National Center for Soybean Biotechnology, University of Missouri-Columbia, Columbia, MO 65211, USA
b USDA-ARS-MSA, 605 Airways Blvd, Jackson, TN 38301, USA
* Corresponding author (SleperD{at}missouri.edu)
| ABSTRACT |
|---|
20 cM away from each other. Different clusters may represent different loci. Reported SCN-resistant QTLs were classified into three categories: suggestive, significant, and confirmed. Confirmed QTLs are credible and can be candidates for fine mapping and gene cloning. QTLs on linkage groups (LGs) G, A2, B1, E, and J were classified as confirmed. Two clusters of QTLs were identified on LG G. One of them is rhg1. One cluster of QTLs was identified near the end of LG B1, but one QTL may exist around the middle of LG B1. One cluster of QTLs was identified on each of LGs A2, E, and J. QTLs on LGs B2, C1, C2, D1a, D2, L, M, and N were classified as suggestive or significant. Confirmation studies are needed to lend credibility to these QTLs. A relationship between soybean QTLs and SCN races is discussed.
Abbreviations: LG, linkage group; PI, plant introduction; QTL, quantitative trait locus; SCN, soybean cyst nematode
| INTRODUCTION |
|---|
A total of 62 marker–quantitative trait locus (QTL) associations have been reported in 17 papers for resistance to SCN races 1, 2, 3, 5, 6, and/or 14 in a total of 13 soybean accessions (nine resistance sources) (Concibido et al., 2004; Glover et al., 2004). Conflicting results often occurred (Concibido et al., 2004). QTLs declared by different studies vary in location, sometimes widely. A number of false positive QTLs may have been reported because low threshold values were used and many genome scans (studies) were completed. Nearly three false positives per genome scan [µ(T) = (20 + 2 × 1.5 × 25 × 4.6 × 2.5) × 0.0032 = 2.8] (Lander and Kruglyak, 1995) are expected when a threshold of LOD = 2.5 is used in soybean mapping. The chance of false positive QTLs is expected to increase with more genome scans even if stringent threshold values are used in single studies.
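The arithmetic in the bracketed expression can be reproduced with a short sketch. The reading of each number is an assumption, not something stated in the text: 20 is taken as the number of linkage groups, 1.5 as a crossover-rate constant, 25 as the genome length in Morgans, 4.6 × 2.5 as the LOD threshold converted to the chi-square scale (2 ln 10 ≈ 4.6), and 0.0032 as the point-wise p value for a chi-square with 2 df. The function and parameter names are hypothetical.

```python
import math

def expected_false_positives(lod, n_groups=20, rho=1.5, genome_morgans=25.0):
    """Expected false positives per genome scan at a given LOD threshold.

    Assumed reading of the bracketed expression in the text:
    mu(T) = (C + 2 * rho * G * T) * alpha(T), with T on the chi-square scale.
    """
    t = 2.0 * math.log(10.0) * lod      # LOD converted to the chi-square scale
    pointwise_p = math.exp(-t / 2.0)    # upper tail of chi-square with 2 df
    return (n_groups + 2.0 * rho * genome_morgans * t) * pointwise_p

mu = expected_false_positives(2.5)
print(round(mu, 1))  # 2.8
```

Under these assumptions the point-wise p value equals 10 raised to minus the LOD, which matches the 0.0032 used in the text for LOD = 2.5.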
Usually, the position of a peak (a QTL) on a region or a chromosome does not coincide exactly with the true position of the QTL in a particular experiment, and QTLs detected by different studies are not necessarily mapped at exactly the same location because of sampling error, even if they are in fact located at the same locus. Sampling error comes mainly from phenotype evaluation and sampling of progeny individuals. Darvasi et al. (1993), Darvasi and Soller (1997), and Roberts et al. (1999) studied the sampling distribution of QTL location using computer simulations. They demonstrated that the confidence interval of QTL location was inversely proportional to population size and QTL effect. Large variation (even covering a whole chromosome) may occur when a QTL has a small gene effect and a small population is used. A statistical method is needed to assess whether QTLs detected on a linkage group map by different studies are located at the same locus or are linked. Recently, Goffinet and Gerber (2000) developed a maximum-likelihood-based meta-analysis for QTL locations among studies. It is called meta-analysis because it analyzes results from different studies and combines information from them. To be valid, it requires more than 10 to 30 reported QTLs from independent studies on the same linkage group (LG). One simple approach for analysis of QTL location is to divide an LG map into regions of a given length and to classify QTLs declared by different studies into a cluster if they fall in the same region (Concibido et al., 2004; Becker et al., 1998). The QTLs of the same cluster may share a locus. Its disadvantages are that the length of a region is arbitrary and that it does not reflect characteristics of the experiments such as population size and type. In this study, a comparative analysis of QTL location among studies was developed that is based on the confidence interval of QTL location. This method is simple to compute and reflects characteristics of the experiments such as population size and type as well as the QTL itself.
An appropriate threshold level for declaring a QTL is an important issue because a large number of dependent test statistics are obtained at a series of putative positions along the genome. QTL analysis involves multiple tests, and the point-wise level should be adjusted to the genome-wide level. The point-wise level is the probability that an extreme test statistic (LOD) will occur at a specific locus by chance alone, whereas the genome-wide level is the probability that an extreme test statistic (LOD) occurs by chance somewhere in the whole genome. At this time, permutation tests (Churchill and Doerge, 1994) are a general approach for the adjustment. Other methods are also available, including computer simulation (Lander and Botstein, 1989; Ooijen, 1999) and mathematical formulas (Lander and Kruglyak, 1995). Too relaxed a threshold value creates a large number of false positives, but too stringent a threshold value slows the discovery of QTLs. To resolve this dilemma, Lander and Kruglyak (1995) classified statistical evidence for marker–QTL associations into four categories: (i) suggestive QTL: one false positive per genome scan (genome-wide type I error = 0.63), (ii) significant QTL: 0.05 false positive per genome scan (genome-wide type I error = 0.05), (iii) highly significant QTL: 0.001 false positive per genome scan (genome-wide type I error = 0.001), and (iv) confirmed QTL: a significant QTL that has been confirmed (replicated) by another independent study. A QTL is usually declared at genome-wide type I error = 0.05 (Lander and Kruglyak, 1995; Members of the Complex Trait Consortium, 2003). A suggestive level often gives false positive QTLs, but such a QTL is worth reporting if accompanied by an appropriate warning label (Lander and Kruglyak, 1995). To be credible, a QTL should be confirmed, and it is better to confirm a QTL before proceeding to fine mapping and cloning. A locus name is appropriate for a confirmed QTL but not for a suggestive QTL (Members of the Complex Trait Consortium, 2003).
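The pairing of "false positives per genome scan" with "genome-wide type I error" in the four categories can be checked numerically. The sketch below assumes that the number of false positives in a scan is approximately Poisson distributed, so that the probability of at least one false positive is 1 − exp(−µ). That distributional assumption and the function name are illustrative and are not stated in the text.

```python
import math

def genome_wide_type_i_error(expected_false_positives):
    """Probability of at least one false positive in a scan,
    assuming a Poisson number of false positives (an assumption)."""
    return 1.0 - math.exp(-expected_false_positives)

# Expected false positives per scan for the three evidence levels in the text.
for mu in (1.0, 0.05, 0.001):
    print(mu, round(genome_wide_type_i_error(mu), 3))
```

With one expected false positive per scan this gives about 0.63, and for small expected counts the two quantities are nearly equal, which is consistent with the values listed for the significant and highly significant levels.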
When more than two studies are conducted for the same traits, meta-analysis can also be used for analyzing statistical evidence (test statistics or p values) from different studies, so that the evidence from different studies is evaluated as a whole and the power of QTL detection may be increased (Lander and Kruglyak, 1995). In fact, meta-analysis was first suggested for analysis of statistical evidence in QTL mapping (Lander and Kruglyak, 1995), and a number of methods have been suggested or developed in human and animal QTL mapping (for example, Lander and Kruglyak, 1995; Li and Rao, 1996; Gu et al., 1998; Etzel and Guerra, 2002; Wise et al., 1999; Allison and Heo, 1998; Badner and Gershon, 2002; Belknap and Atkins, 2001). Some of them (Wise et al., 1999; Allison and Heo, 1998; Badner and Gershon, 2002; Belknap and Atkins, 2001) are applicable to experimental organisms, including plant species. Wise et al. (1999) developed a nonparametric meta-analysis in which a genome is divided into regions, the regions are ranked according to test statistics or p values, and a nonparametric statistical method is then applied. Allison and Heo (1998), Badner and Gershon (2002), and Belknap and Atkins (2001) combined p values from different studies using the fact that −2 ln(p) is distributed as χ² (df = 2) and the additive nature of independent χ² values. The first two adjusted p values before combining them, whereas the last did not. The goal of adjusting p values is to control type I error. We favor no adjustment of p values before combining them, with genome-wide adjustment applied after combining, because adjustment after combining can also control type I error, whereas adjustment before combining complicates the meta-analysis. Allison and Heo (1998) and Badner and Gershon (2002) adopted different adjustments. The former can be regarded as a chromosome-wide adjustment and the latter as a region-wide adjustment. The key issues in the use of meta-analysis for statistical evidence are heterogeneity among mapping populations and an appropriate threshold. Heterogeneity among populations (for example, different SCN-resistant plant introductions) often complicates interpretation of the results of meta-analysis. In addition, incomplete and inconsistent information reported in various studies makes it difficult to conduct a meta-analysis. In this study, we did not conduct meta-analysis for statistical evidence because most SCN-resistant QTL studies used different resistance sources, and test statistics or p values are available only for regions with declared QTLs. If raw datasets are available, pooled analysis (Walling et al., 2000; Li et al., 2005; Guo et al., unpublished) would be a better method for analysis of QTLs among studies.
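The combination of p values described above can be illustrated with a minimal sketch. It sums −2 ln(p) over k independent studies and refers the total to a chi-square distribution with 2k df, using the closed-form upper tail that exists for even df. No adjustment before or after combining is applied here, the input p values are made up for illustration, and the function names are hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i        # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total

def combine_p_values(p_values):
    """Sum -2 ln(p) across independent studies; refer to chi-square with 2k df."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_upper_tail_even_df(statistic, 2 * len(p_values))

# Hypothetical point-wise p values from two independent studies of one region.
statistic, combined_p = combine_p_values([0.01, 0.02])
print(statistic, combined_p)
```

For a single study the combined p value reduces to the original p value, which is a quick sanity check on the df = 2 statement in the text.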
Objectives of this study were to: (i) evaluate evidence for reported marker–QTL associations for resistance to SCN in soybean and (ii) extract relatively reliable and useful information from a large number of reported marker–QTL associations.
| METHODOLOGY |
|---|
≥ 3.0 (point-wise p value ≤ 0.001) were used in this study (Table 1), including QTLs detected by Glover et al. (2004) and in our studies (Guo et al., unpublished; Lu et al., unpublished). Note that most studies used SCN populations maintained at the University of Missouri-Columbia.
Location of QTL
The molecular marker or position with the highest test statistic on an LG map or a region of an LG map was regarded as the estimated location of a QTL from a particular experiment.
In reported SCN-resistant QTLs, the locations of QTLs were expressed on linkage maps constructed in particular experiments. For comparisons across studies, the reported locations of QTLs need to be projected onto a known common linkage map. We projected a reported QTL onto the soybean composite linkage map (Song et al., 2004) based on its relative position between its flanking markers in the original study. A reported QTL was not projected if its flanking markers in a particular experiment were not consistent with the soybean composite linkage map for LGs or if the marker order of the LG map constructed in the particular experiment was significantly different from that of the soybean composite linkage map.
Confidence Intervals of QTL Location
The 95% confidence interval (CI) of a QTL location was obtained by the following formulae:
![]() | [1] |
![]() | [2] |
![]() | [3] |
is the standardized phenotypic effect (expressed in residual standard deviation units) of a single allele substitution at a QTL, N is the population size, and R² is the proportion of the total variation explained by a QTL. The R² provided by interval mapping, composite interval mapping, or ANOVA was used for estimation of R² in the above formulae. Here, we assumed that interval mapping and composite interval mapping provided a good estimate of R² and that ANOVA provided a reasonable estimate of R².
Formulae [1] and [2] were first derived by Darvasi and Soller (1997) for dense molecular marker linkage maps using extensive simulations. They were independently proven by Visscher and Goddard (2004) and Weller and Soller (2004) using somewhat different mathematical methodologies. Formula [3] can easily be derived from the formula described by Weller and Soller (2004) (phenotyping five plants for each line). The above formulae also apply to moderate marker spacing (10–20 cM) (Darvasi and Soller, 1997). If an unbiased estimate of
or R² is used, an unbiased CI will be obtained (Darvasi and Soller, 1997). Use of a threshold for declaring a QTL may cause overestimation of the gene effect and underestimation of the CI if a QTL has a small gene effect (i.e., low detection power). However, the CI can still be obtained with approximately the correct probability of containing the true map location of the QTL (Darvasi and Soller, 1997).
If the heterogeneous region of near-isolines was not clearly defined in the original study or it was larger than the CI determined by the above formula [1], [2], or [3], the above formula [1], [2], or [3] was used for obtaining the CI of a QTL.
The CI region of a QTL on the soybean composite linkage map was determined by centering its estimated CI on its location. If one side of the CI extended beyond the end of an LG, the CI was truncated at the end of the LG.
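The placement rule can be sketched as follows. The CI width is taken as an input (however it was computed), the LG map is assumed to start at 0 cM, and the function and parameter names are hypothetical.

```python
def ci_region(position_cm, ci_width_cm, lg_length_cm):
    """Center a CI of the given width on the QTL position and
    truncate it at the ends of the linkage group (assumed to span 0..lg_length_cm)."""
    half = ci_width_cm / 2.0
    start = max(0.0, position_cm - half)
    end = min(lg_length_cm, position_cm + half)
    return start, end

# A QTL 5 cM from the start of a 100-cM LG with a 20-cM CI is cut off at 0 cM.
print(ci_region(5.0, 20.0, 100.0))
```

Note that a truncated region is narrower than the nominal CI, so a QTL near the end of an LG contributes a shorter interval to any later overlap comparison.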
Meta-Analysis of Reported Marker–QTL Associations
QTLs for different races or different studies were classified into one cluster if their estimated 95% CI regions had a region in common. QTLs from the same cluster may share a locus. Fine mapping is needed to exclude or confirm that QTLs from the same cluster are closely linked genes. QTLs for different races or different studies were classified into different clusters if their CI regions had no region in common and were ≥ 20 cM away from each other. Different clusters may represent different loci. Additional studies are needed if the CI regions were close but did not overlap. A QTL detected in a particular experiment was excluded if its CI region covered a whole chromosome.
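One way to sketch the clustering rule is a single pass over CI regions sorted by start position, keeping for each cluster the region shared by all of its members. This greedy pass is an illustrative assumption rather than the procedure used in the study, the 20-cM comparison is treated as "at least 20 cM", and the study labels and coordinates are made up.

```python
def cluster_ci_regions(regions):
    """Group (label, start_cm, end_cm) CI regions on one LG into clusters whose
    members all share a common region. Greedy single pass; a sketch only."""
    clusters = []
    for label, start, end in sorted(regions, key=lambda r: r[1]):
        if clusters:
            last = clusters[-1]
            lo = max(last["common"][0], start)
            hi = min(last["common"][1], end)
            if lo < hi:  # a region is still shared by every member
                last["members"].append(label)
                last["common"] = (lo, hi)
                last["span"] = (last["span"][0], max(last["span"][1], end))
                continue
        clusters.append({"members": [label], "common": (start, end), "span": (start, end)})
    return clusters

def clearly_separate(left, right, min_gap_cm=20.0):
    """True if the two clusters' CI regions are at least min_gap_cm apart.
    Clusters that fail this test would call for additional study."""
    return right["span"][0] - left["span"][1] >= min_gap_cm

# Hypothetical CI regions (cM) from four studies on the same linkage group.
example = [("study_A", 10.0, 30.0), ("study_B", 25.0, 40.0),
           ("study_C", 28.0, 35.0), ("study_D", 70.0, 90.0)]
clusters = cluster_ci_regions(example)
for cluster in clusters:
    print(cluster["members"], cluster["common"])
```

In this made-up example the first three regions share 28 to 30 cM and form one cluster, while the fourth lies 30 cM beyond them and forms a second, clearly separate cluster.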
Classification of Statistical Evidence of QTLs
Thresholds for Declaring QTLs
In previous soybean SCN-resistant QTL mapping studies, the following threshold levels were used for declaring QTLs: (i) LOD = 2.5 (equivalently p = 0.003) (Yue et al., 2001a, 2001b), (ii) p = 0.002 (Concibido et al., 1994, 1996, 1997), and (iii) LOD = 3 (equivalently p = 0.001) (Webb et al., 1995; Heer et al., 1998; Qiu et al., 1999; Wang et al., 2001; Meksem et al., 2001). Permutation tests were rarely used to determine threshold levels (Glover et al., 2004). We used three methods to determine the threshold value for declaring a QTL at the suggestive level (genome-wide type I error = 0.63) and at the significant level (genome-wide type I error = 0.05) for soybean F2 mapping populations (used in the majority of studies). We obtained a threshold LOD of 2.9 at the suggestive level and 4.2 at the significant level using Ooijen's (1999) computer simulation tables. The threshold LOD was 3.0 at the suggestive level and 4.5 at the significant level using Lander and Kruglyak's (1995) formula. The threshold LOD was 3.7 to 4.0 for different races at the significant level using permutation tests (Churchill and Doerge, 1994) based on our two mapping populations (1000 permutations for each race in each population) (Guo et al., unpublished). In summary, a threshold LOD of 3.0 corresponds approximately to genome-wide type I error = 0.63 (the suggestive level) and a threshold LOD of 4.0 to genome-wide type I error = 0.05 (the significant level) in soybean. A QTL is usually declared at genome-wide type I error = 0.05. The suggestive level often gives false positive QTLs, but a suggestive QTL can be good evidence if accompanied by other evidence. Therefore, a suggestive QTL was declared at LOD ≥ 3.0 and a significant QTL at LOD ≥ 4.0 in this study.
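The formula-based thresholds quoted above can be approximated by solving for the LOD at which the expected number of false positives per scan equals 1 (suggestive) or 0.05 (significant). The sketch below assumes the same reading of the Lander and Kruglyak (1995) expression as the bracketed calculation in the Introduction: 20 linkage groups, a crossover-rate constant of 1.5, a 25-Morgan genome, and a point-wise p value of 10 to the power of minus LOD (chi-square with 2 df). These parameter interpretations and the function names are assumptions.

```python
import math

def expected_false_positives(lod, n_groups=20, rho=1.5, genome_morgans=25.0):
    """Expected false positives per genome scan at a LOD threshold (assumed form)."""
    t = 2.0 * math.log(10.0) * lod              # LOD on the chi-square scale
    return (n_groups + 2.0 * rho * genome_morgans * t) * 10.0 ** (-lod)

def lod_threshold(target, low=1.0, high=10.0, tol=1e-6):
    """Bisection for the LOD giving the target expected false positives.
    The expectation decreases with LOD over the bracket used here."""
    while high - low > tol:
        mid = (low + high) / 2.0
        if expected_false_positives(mid) > target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

suggestive = lod_threshold(1.0)     # one false positive per genome scan
significant = lod_threshold(0.05)   # 0.05 false positive per genome scan
print(round(suggestive, 1), round(significant, 1))  # 3.0 4.5
```

Under these assumptions the p-value equivalents quoted in the text also follow directly: LOD = 3 corresponds to p = 0.001 and LOD = 2.5 to p ≈ 0.003.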
Classification of QTLs
With reference to Lander and Kruglyak (1995), we classified soybean SCN-resistant QTLs into three categories: (i) suggestive QTL: LOD ≥ 3.0 (p value ≤ 0.001), (ii) significant QTL: LOD ≥ 4.0 (p value ≤ 0.0001), and (iii) confirmed (replicated) QTL. A confirmed QTL is defined by Lander and Kruglyak (1995) as a significant QTL from one study that has subsequently been confirmed by a second study. Confirmation of a QTL includes two stages. The first stage involves searching for a QTL, usually across the whole genome, using one mapping population sample. The second stage focuses on QTL detection in the candidate region (typically 20 cM) established in the previous study, using another mapping population sample. The second stage can be accomplished using near-isogenic lines, independent crosses, and breeding selection (Members of the Complex Trait Consortium, 2003). Typical examples of confirmed SCN resistance QTLs are those reported by Meksem et al. (2001), Glover et al. (2004), and Wang et al. (2001). The first two studies used near-isogenic lines and the last one an independent cross in the second stage. The above definition of confirmed QTL was extended in this study to include the two following situations.
| RESULTS |
|---|
One cluster of QTLs was identified near the I locus on LG A2 in soybean PI 90763, PI 437654, Peking (including Forrest), PI 404198A, J87-233, and M85-1430 (Table 2 and Fig. 1). The CI regions of the first five soybean lines were 4 to 29 cM. The CI region of the last one was wide (64 cM). The CI regions have a 2-cM region in common. Rhg4 has been mapped close to molecular marker Satt632 and the I locus (Cregan et al., 1999b; Meksem et al., 2001). Satt632 and the I locus were within the CIs of QTLs identified in PI 90763, PI 437654, Peking, Forrest, PI 404198A, J87-233, and M85-1430 (Fig. 1). It is concluded that these lines may carry Rhg4. It must be noted that Rhg4 may be detected in one study but not in others, although a high detection power is expected because of its large effect. For example, it was detected in soybean breeding line M85-1430 (Concibido et al., 1994) but not in PI 209332, from which M85-1430 was derived (Concibido et al., 1996). One possible explanation is that the QTL on LG A2 may be modified by other genes.
One cluster of QTLs was identified near one end of LG B1 in soybean PI 90763, PI 404198A, and PI 438489B (Table 2 and Fig. 1). The CI regions have a 2-cM region in common. Another QTL was identified near the middle of LG B1 in soybean PI 89772 (Fig. 1). Its CI region was close to that of soybean PI 438489B but ≥ 20 cM away from those of soybean PI 90763 and PI 404198A. There are two possible explanations for the QTL on LG B1 in PI 89772. One is that the QTL peak obtained in PI 89772 is a local peak arising from sampling error rather than a global peak on LG B1, because the end region (from Satt359 to Satt451) of LG B1 was not searched for a QTL (Yue et al., 2001b). The other is that a QTL truly exists near the middle of LG B1. Additional studies are needed to distinguish between these explanations.
One cluster of QTLs was identified near the middle of LG E in soybean PI 90763, PI 468916, PI 438489B, and PI 467312 (Table 2 and Fig. 1). Their CI regions had a 5-cM region in common (the CI region of PI 467312 was undetermined). One QTL on LG E was identified from the original study in soybean PI 89772 but the distance between its flanking markers, A135 and Satt231, in the original study (Yue et al., 2001b) was significantly different from that on the soybean composite linkage map. Because of this, it was not shown in Fig. 1. Further study is needed.
One cluster of QTLs was identified near the end of LG J in soybean PI 90763, PI 209332 (including M851430), Bell (PI 88788), and PI 467312 (Table 2 and Fig. 1). Their CI regions had a 9 cM region in common (the confidence region of PI 467312 was undetermined). The QTL in Bell has been confirmed and designated as cqSCN-003 (Glover et al., 2004). It is concluded that these PIs may carry cqSCN-003.
| DISCUSSION |
|---|
Specific Association of QTLs with SCN Races
QTLs for resistance to different races fall in the same regions on LGs G, A2, B1, E, or J (Fig. 1). QTLs for resistance to different races were regarded as the same if they fell in the same region. Table 3 summarizes the relationship of soybean QTLs with resistance to SCN populations maintained at the University of Missouri-Columbia (which are designated as races 1, 2, 3, 5, and 14). These races are believed to be near-homogeneous because they have been reproduced in small populations for more than 30 generations (Arelli et al., 1997, 2000). The QTL on LG G (rhg1) is associated with resistance to races 1, 2, 3, and 5 in all the PIs involved (Table 3). However, it may be less frequently associated with resistance to race 14 (it might have a small effect on race 14, and races more virulent to rhg1 might exist). The QTL on LG A2 (Rhg4) is frequently associated with resistance to race 3; however, it is less frequently associated with resistance to races 2, 5, and 14. In contrast to the QTL on LG A2, the QTL on LG B1 is frequently associated with resistance to races 2 and 5 but less frequently associated with resistance to race 3. QTLs on LGs E and J are frequently associated with resistance to races 14 and 3. The QTL on LG E may be less frequently associated with resistance to races 1, 2, and 5. The QTL on LG J is less frequently associated with resistance to race 5. In conclusion, there seems to be a specific relationship between soybean QTLs and SCN populations.
| ACKNOWLEDGMENTS |
|---|
Received for publication April 27, 2005.