Published online 18 December 2007
Published in Crop Sci 47:S-60-S-71 (2007)
© 2007 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
Bridging Genomics and Genetic Diversity: Linkage Disequilibrium Structure and Association Mapping in Maize and Other Cereals
Jean-Baptiste Veyrieras (a),
Letizia Camus-Kulandaivelu (a),
Brigitte Gouesnard (b),
Domenica Manicacci (a) and
Alain Charcosset (a,*)
(a) UMR 8120 Génétique Végétale, INRA UPS INA-PG CNRS, Ferme du Moulon, 91190 Gif sur Yvette, France
(b) UMR 1097 Diversité et Génomes des Plantes Cultivées, INRA Domaine de Melgueil, 34130 Mauguio, France
(*) Corresponding author (charcos{at}moulon.inra.fr).
ABSTRACT
Linkage disequilibrium (LD) and association mapping are receiving considerable attention in the plant genetics community for their potential to use existing genetic resource collections to fine map quantitative trait loci (QTL), validate candidate genes, and identify alleles of interest. Based on investigations in maize (Zea mays L.) either published or recently conducted in our group, we discuss three elements of particular importance for conducting association mapping or interpreting the results: (i) the analysis of population structure into subgroups, (ii) its use to control for spurious associations and consequences in the specific case of differential selection among subgroups, and (iii) the analysis of the local structure of LD into haplotypes and its consequences on the resolution and the application of LD mapping. Consequences and perspectives for plant breeding are briefly discussed.
Abbreviations: HMM, Hidden Markov Model; LD, linkage disequilibrium; PCA, principal component analysis; QTL, quantitative trait loci; SSR, simple sequence repeat
Received for publication April 7, 2007.
INTRODUCTION
Thanks to molecular markers, much progress has been made in understanding the organization of genetic diversity within collections of genetic resources. Also, major advances have been made in identifying regions of the genome (quantitative trait loci [QTL] in case of quantitative traits) that contribute to the variation of traits of agronomical interest. Until recently, QTL mapping has generally been performed in specific controlled mapping populations, involving a limited number of parents. Recent statistical developments now make it possible to use much more diversified materials, genetic resources collections, and breeding material, for mapping regions involved in trait variation. First, association genetics, the analysis of the relationship between gene polymorphism and trait variation in collections representing a broad diversity with no or limited relatedness between accessions, is now widely applied in plants following the pioneer work of Thornsberry et al. (2001) (for reviews see Gupta et al., 2005; Yu and Buckler, 2006). This approach has been made possible to a large extent by the development of new statistical methods to analyze population structure. Population structure can indeed contribute to linkage disequilibrium (LD) (i.e., a statistical association) between polymorphisms (Flint-Garcia et al., 2003), even if the corresponding loci are not linked physically. So taking population structure into account in the tests is important to avoid erroneously concluding the causality of a gene or genomic region. Second, new QTL mapping methods have been developed to conduct QTL mapping in breeding populations with complex pedigree structure (Crepieux et al., 2004; Parisseaux and Bernardo, 2004; Zhang et al., 2005). Among these recent methods, that of Yu et al. (2006) deals with both ancestral population structure and possible complex pedigree relationship among individuals.
Population structure and/or relatedness being assumed dealt with appropriately, the mapping accuracy of these approaches is determined by the local magnitude of LD (Rafalski, 2002). Relationship between LD and physical distance has proven highly variable among a range of species, type of materials, and also genes within species (see for reviews Flint-Garcia et al., 2003; Rafalski and Morgante, 2004; Yu and Buckler, 2006). This determines the precision of the approach. A slow decay (LD holding over hundreds of kilobases) will be favorable for LD mapping with an appropriate density of markers that do not necessarily contain polymorphisms in the causal gene. On the other hand resolution will not be high enough to discriminate between closely linked polymorphisms. A fast decay of LD over very short distances (a few hundred base pairs), will be favorable to validate a candidate gene and identify the causal region within the gene. On the other hand, a very high density of markers needs to be used if no candidate gene is available. Information on the local magnitude of LD is therefore of key importance to optimize experiments, either for a genome-wide approach with no a priori information or for the analysis of specific regions determined from existing information on QTL mapping. Note that the identification of such regions can be facilitated to a large extent thanks to meta-analysis of QTL, which helps in determining regions of the genome repetitively involved in trait variation (Chardon et al., 2004) and may narrow down the confidence intervals of QTL just by taking full advantage of existing information (Veyrieras et al., 2007). In case of low local LD, relevant choice of candidate polymorphism is of course a key factor for the success of the approach. 
In addition to well-established considerations regarding the choice of candidate genes (Pflieger et al., 2001), it should be noted that very first results of map-based cloning in plants (see Price, 2006; Salvi and Tuberosa, 2005) show that causal factors may in some cases lie at more than several tens of kilobases from the nearest gene as illustrated by genes tb1 (Clark et al., 2006) and vgt1 (Salvi et al., 2007). Such intergenic regions in the vicinity of candidate genes may therefore deserve consideration as candidate polymorphism for association and LD mapping.
We will discuss in this survey three topics of key importance regarding association genetics and LD mapping: (i) the analysis of population structure, (ii) the relationship between population structure and trait variation and its consequences for testing the effect of candidate polymorphisms, and (iii) the local magnitude of LD and perspectives for modeling it. These aspects will be mostly discussed considering results on maize (Zea mays L.) adaptation to temperate climate as a case study. Finally consequences for marker-assisted breeding will be briefly presented.
Understanding Population Structure
Understanding the organization of genetic diversity within agricultural species has long been a matter of interest to identify original genetic resources to be used as parents of breeding programs, to define heterotic groups for hybrid breeding, etc. Molecular markers have proven particularly helpful to address these issues. Until recently, analysis of molecular marker data has been approached by hierarchical clustering, using algorithms such as UPGMA and Ward, and by factorial analyses such as principal component analysis (PCA) and principal coordinate analysis. Hierarchical clustering and factorial analyses lead to complementary information: discrete classification into groups and more quantitative positioning of the accessions, respectively. The advent of quantitative (probabilistic) clustering, first achieved with the STRUCTURE software (http://pritch.bsd.uchicago.edu/software.html) using a Bayesian approach, has been a real breakthrough in combining these two kinds of information in a unified analysis of population structure (Pritchard et al., 2000). As a consequence, this software has been used in numerous publications in plants. Note that more recently Tang et al. (2005) and Wu et al. (2006) developed similar maximum likelihood methods but, contrary to STRUCTURE, within the frequentist framework (both based on an EM algorithm). Nevertheless, the advantages of these model-based clustering methods come at a price: choosing the optimal number of clusters remains a major issue (although in the case of STRUCTURE the ad hoc method of Evanno et al. [2005] seems to give interesting results).
Population structure is obviously the consequence of multiple events, starting from the modalities of domestication, either unique as presently established in the case of maize (Matsuoka et al., 2002) or multiple as in the case of rice (Oryza sativa L.) (see Garris et al., 2005, for an investigation of population structure in rice). In the case of maize, one of the most striking features of genetic diversity organization is certainly the exceptional divergence of the "Northern Flint" group, clearly underlined by Doebley et al. (1986) using isozyme markers. This divergence is illustrated by axis 1 in the PCA of 24 simple sequence repeat (SSR) markers on a large set of 275 open pollinated varieties of American and European origins (Fig. 1, based on SSR data of Dubreuil et al., 2006). This group gathers materials with very specific morphological characteristics such as essentially "flint" (vitreous) kernels and cylindrical ears with 8 to 10 rows of kernels. It is also characterized by early flowering and photoperiod insensitivity. This makes it well adapted to relatively cool temperate climates such as northeastern America where its cultivation is attested since AD 800, after which a rapid and drastic modification of agricultural practices followed (Smith, 1989). It is also well adapted to northeastern Europe, where it was introduced before 1539, and also southern Chile, where its introduction remains to be further analyzed.

Figure 1. Structure among all accessions as revealed by principal component analysis (PCA) of the variance–covariance matrix of allele frequencies among open pollinated varieties (data from Dubreuil et al., 2006). Axis 1 can be interpreted as the opposition between Northern Flint and tropical materials. Axis 2 can be interpreted as the opposition between the non-Northern Flint origins of northern American and European materials, respectively. Northern American populations were identified according to racial classification (ASW, American southwestern; ASD, American southern Dent; ACB, American Corn Belt; ANF, American Northern Flint). Other American origins were identified according to the country of origin: Arg, Argentina; Bol, Bolivia; Chl, Chile; Cos, Costa Rica; Cub, Cuba; Dom, Dominican Republic; Ecu, Ecuador; Gua, Guatemala; Mex, Mexico; Pan, Panama; Per, Peru; Uru, Uruguay; Ven, Venezuela; Win, West Indies. European origins were identified according to the country or region of origin (Als, Alsace; Bul, Bulgaria; Cze, former Czechoslovakia; Fra, central France; Gal, Galicia; Ger, Germany; Ita, Italy; Pol, Poland; Pyr, Pyrenean; SpS, southern Spain; Ukr, Ukraine; Yug, former Yugoslavia).
Northern Flint maize has also been involved in hybridizations with tropical groups, yielding new varieties (Corn Belt Dent) used by settlers in North America (Anderson and Brown, 1952; Doebley et al., 1988). It also has contributed to the development of varieties specific to intermediate latitudes of Europe, through hybridization with different groups of tropical origin(s) (Rebourg et al., 2003). The signature of these past hybridizations can clearly be viewed by the intermediate position of the corresponding groups on axis 1 of PCA (Fig. 1). Considering the same dataset using STRUCTURE software with "correlated frequency mode" (i.e., a common ancestral population from which groups diverged recently), Camus-Kulandaivelu et al. (2006) clearly concluded that there were seven ancestral groups (Fig. 2). It can be noted that this analysis identifies as ancestral groups Northern Flint and three tropical groups that can be interpreted as Mexican, Caribbean, and Andean, respectively. Ancestral groups identified by STRUCTURE also include European, Italian, and Corn Belt groups which are known to have been formed by past hybridizations between Northern Flint and materials from different tropical origins. The identification of these groups as "ancestral" by this approach indicates that the number of generations (several hundreds) involved since their formation has led to an accumulation of recombination events, so that they show almost no LD among markers and these past admixtures no longer can be detected by the approach. This illustrates that, despite the appealing admixture population model implemented in STRUCTURE, still performing a standard statistical method like PCA can be justified to further interpret the data. Recent elements on the parallel between PCA and cluster analysis can be found in Patterson et al. (2006). They show that applying PCA and testing the significance of the first axes may help to determine the number of groups underlying the population structure (number of significant axes + 1). This may also prove helpful for identifying situations where STRUCTURE fails to identify more than one group, whereas PCA suggests significant LD due to an underlying structure, which needs to be taken into account in association tests (see next section).
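The PCA step discussed above can be sketched as follows. This is a minimal illustration, assuming a hypothetical matrix `freq` of allele frequencies (populations in rows, alleles in columns) and toy values; it is not the exact procedure applied to the SSR data of Dubreuil et al. (2006), and the function name is ours.

```python
import numpy as np

def pca_allele_frequencies(freq, n_axes=2):
    """PCA of a (populations x alleles) allele-frequency matrix.

    Returns the coordinates of the populations on the first axes and
    the share of total variance carried by each of these axes.
    """
    freq = np.asarray(freq, dtype=float)
    centered = freq - freq.mean(axis=0)  # center each allele column
    # The SVD of the centered matrix is equivalent to diagonalizing the
    # variance-covariance matrix of allele frequencies.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    coords = u[:, :n_axes] * s[:n_axes]
    explained = s ** 2 / np.sum(s ** 2)
    return coords, explained[:n_axes]

# Toy data (hypothetical): two populations of one origin, two of another,
# and a fifth population with intermediate frequencies (admixed).
freq = np.array([
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.9, 0.1],
    [0.1, 0.9, 0.2, 0.8],
    [0.2, 0.8, 0.1, 0.9],
    [0.5, 0.5, 0.5, 0.5],
])
coords, explained = pca_allele_frequencies(freq)
```

In this toy case the two contrasted origins fall on opposite sides of axis 1, while the population with intermediate frequencies sits between them, which mirrors the intermediate position of hybrid-derived groups described in the text. The sign of each axis is arbitrary.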

Figure 2. Models for population structure at three steps of maize selection history: open pollinated varieties, also called landraces (seven groups), first cycle inbreds (five groups), and whole inbred panel (five groups). Groups for each panel are represented by colors as indicated at the bottom of the figure. For the inbred panels, each inbred line is represented by a vertical line divided into colored segments, the length of which indicates the proportion of the genome attributed to the different groups. For the landrace panel, each population is represented by the mean proportions estimated for the five inbred lines simulated to represent it. Plain arrows stand for filiation relationship between clusters and have been established on the basis of either STRUCTURE assignments (in the joint study of landraces and first cycle inbreds) or genetic distances between groups of inbred line panels. Dotted arrows indicate lower contributions (less than three inbred lines with a high genome proportion (>0.80) attributed to a group obtained in the STRUCTURE joint analysis of landraces and first cycle inbred lines). From (Camus-Kulandaivelu et al., 2006).
Population structure has been further shaped by modern breeding. Hybrid breeding programs started by creating inbred lines by repetitive selfing of plants issued from open pollinated varieties (further referred to as first cycle lines). Figure 2 shows STRUCTURE results for a panel of 153 first cycle lines, compared to that determined for open pollinated varieties. As expected, both are highly consistent, despite some merging of groups, probably due to uneven sampling. These lines have then been used in turn as parents to create new lines using a variety of methods (Hallauer, 1990), among which is backcross, or pedigree breeding, with the possible intermediate of recurrent selection. When analyzing a total panel of 375 inbred lines including all lines from the former panel and 222 lines from such more-advanced generations, we have found that it was much more difficult to unambiguously conclude a number of ancestral groups than it was for open pollinated varieties and first cycle inbred lines alone. First, the ad hoc pseudo deviance statistics recommended by STRUCTURE authors to determine the number of groups shows no clear stabilization. Second, we found that the nature of the groups, as determined from the pedigrees of lines with a high attribution (>80%) to a group, varied among outputs for a same number of groups. This instability for the total inbred lines panel was confirmed by a distance-based comparison of outputs using the expectations of allele presence for all individuals (Camus-Kulandaivelu et al., 2007). For the first inbred lines panel, this approach clearly showed clusters of outputs with a same group number, whereas the total inbred line panel led to a very complex pattern.
We thus selected in this situation the output based on an empirical compromise between minimal group number and maximum likelihood of the data. This led to five groups, which appeared consistent with the organization found for populations and first cycle inbreds, with the additional individualization of a Stiff Stalk group within the Dent group. This group gathers lines derived from the famous Iowa Stiff Stalk Synthetic, which was conducted over numerous cycles of recurrent selection. Note that this corresponds to the famous Stiff Stalk vs. non–Stiff Stalk heterotic pattern still prevailing for hybrid breeding in the U.S. Corn Belt and regions with comparable climatic conditions. This heterotic pattern is therefore the result of modern hybrid breeding, contrary to the Northern European Flint/Dent heterotic pattern which preexisted hybrid breeding, as illustrated by Fig. 1 and 2. Runs of STRUCTURE for more than five groups led to the individualization of small groups of highly related lines, often related to the same major progenitor (such as B73, F2, F7, etc.). Because these small groups make relatively similar contributions, many possibly very different groupings can fit the data very closely, which makes it intrinsically extremely difficult to use the approach, which was not devised to handle such situations.
Note that results obtained using the 375 inbred line panel appear consistent with other studies but also show a clear effect of material sampling. In the first association panel used by Thornsberry et al. (2001) (included in our own panel), and in new panels increased in size up to 302 lines (Flint-Garcia et al., 2005; Liu et al., 2003), Northern Flint and European Flint materials were represented to a limited extent, so that these studies concluded with three main groups: subtropical (consistent with our tropical group), Stiff Stalk material (consistent with ours), and non–Stiff Stalk material, with then additional small groups of sweetcorn and popcorn lines. On the other hand, a Flint group differentiated from Dent groups was also identified by Andersen et al. (2005), when analyzing inbred lines used in European agriculture.
Testing Polymorphism Effect in Presence of Population Structure
When testing if a candidate polymorphism contributes to trait variation in a collection of diverse materials, it is necessary to investigate first the possible global effect of population structure. By construction, groups underlying population structure show contrasted allele frequencies for at least some of the markers used in the structure analysis. This differentiation can be quantified by usual population genetics parameters (see below). Groups also can be differentiated for traits of agronomical importance, either because of drift during their independent evolution or possibly because of differential selection. This differentiation can be checked by regression of performance on contribution of groups to the genome of individuals following Model 1:
$$T_j = a_0 + \sum_{i=1}^{k} a_i g_{ij} + e_j \qquad [1]$$
where $T_j$ stands for the trait value of genotype j, $a_0$ for the intercept, $g_{ij}$ for the proportion of the genome of genotype j attributed to group i (k groups in total), $a_i$ for the fixed effect of group i, and $e_j$ for the residual.
Application of Model 1 is illustrated in Table 1 for flowering time and kernel traits evaluated on the total inbred panel of 375 lines analyzed by Camus-Kulandaivelu et al. (2006). The magnitude of variation explained by population structure depends on the trait of interest, from 4 to 6% for kernel composition traits to 51% for flowering time. This highest relationship for male flowering time is consistent with results of Flint-Garcia et al. (2005) on a different panel (35%). As expected from climate characteristics in regions of origins, Northern Flints and European Flints exhibit the earliest flowering time, whereas tropical material exhibit the latest flowering time. Besides early flowering, Northern Flint and European Flint material is also characterized by a high vitreousness of endosperm and smaller kernels, so that these traits show strong associations with population structure (31 and 27%, respectively). Similar relationships between populations structure and adaptive traits have been reported in rice (Semon et al., 2005).
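The share of trait variation explained by population structure under Model 1 can be sketched as an ordinary least-squares regression. The example below is a minimal illustration on simulated data: the membership matrix, group means, and noise level are hypothetical and are not the values of the maize panel. One group column is dropped in the fit because, with an intercept, the k proportions of an individual sum to 1 and are therefore linearly dependent.

```python
import numpy as np

def variance_explained_by_structure(trait, membership):
    """R^2 of the regression of a trait on group membership proportions.

    membership: (n_individuals x k) matrix whose rows sum to 1.
    The last group column is dropped to avoid collinearity with the
    intercept.
    """
    trait = np.asarray(trait, dtype=float)
    g = np.asarray(membership, dtype=float)
    design = np.column_stack([np.ones(len(trait)), g[:, :-1]])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    residuals = trait - design @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((trait - trait.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n = 200
membership = rng.dirichlet([0.3, 0.3, 0.3], size=n)   # three groups
group_means = np.array([60.0, 75.0, 90.0])            # e.g. days to flowering
flowering = membership @ group_means + rng.normal(0.0, 3.0, size=n)
neutral_trait = rng.normal(0.0, 1.0, size=n)          # unrelated to structure

r2_structured = variance_explained_by_structure(flowering, membership)
r2_neutral = variance_explained_by_structure(neutral_trait, membership)
```

A trait differentiated among groups gives a high R², whereas a trait unrelated to structure gives a value close to zero, which is the contrast reported in the text between flowering time and kernel composition traits.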
Such relationships between population structure and trait variation can cause indirect associations between neutral polymorphism and the variation of a trait of interest. Indeed, in extreme situations, any neutral polymorphism specific to a group that displays an extreme value for the trait of interest will be significantly associated with the trait, if tested through classical models. This possible confounding effect of population structure has been well recognized in human genetics and also in the very first studies conducted in plant genetics (Thornsberry et al., 2001). An appropriate test of the association must take it into account as
$$T_j = a_0 + \sum_{i=1}^{k} a_i g_{ij} + b x_j + e_j \qquad [2]$$
where $x_j$ indicates the presence vs. absence of the allele of interest for individual j (in the case of two alleles and homozygous individuals) or, more generally, the dose of the allele for individual j, and b is the associated fixed effect. Note that alternative recent approaches can be considered to deal with the consequences of population structure, such as using the first most significant PCA axes as covariates (Price et al., 2006). This could be of particular interest in situations where STRUCTURE may fail to reveal more than one group but PCA suggests significant population structure (Patterson et al., 2006). Although a priori less relevant because of their discrete nature, qualitative classifications obtained by means of hierarchical clustering (or possibly a priori information such as geographical origins) may also be considered by using class assignation (0 vs. 1) as fixed covariate effects. In any case, to prevent possible residual effects of population structure not accounted for by the covariates, adjustment of test statistics using genomic control should be recommended, that is, a rescaling of P values by the distribution of test statistics observed for a supplemental set of neutral markers (Zhao et al., 2007). When several polymorphism × trait combinations are tested, risks should be adjusted using a False Discovery Rate approach (Storey and Tibshirani, 2003). Note that more recently, several authors proposed to use Bayesian regression and Bayes Factor quantities to deal with the multiple-testing issue (Balding, 2006).
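The confounding effect that Model 2 is meant to control can be sketched with a small simulation. In the hypothetical example below, a neutral marker has contrasted frequencies in two groups that also differ for the trait; the marker has no effect of its own. The function name, sample size, and parameter values are ours, and the t statistic is the usual ordinary least-squares one, without genomic control or multiple-testing adjustment.

```python
import numpy as np

def marker_t_statistic(trait, marker, membership=None):
    """t statistic of the marker effect b in a linear model.

    Without `membership`, the trait is regressed on the marker alone.
    With `membership` (n x k group proportions), group covariates are
    added as in Model 2; the last group column is dropped because the
    proportions sum to 1.
    """
    y = np.asarray(trait, dtype=float)
    x = np.asarray(marker, dtype=float)
    columns = [np.ones_like(y)]
    if membership is not None:
        columns.append(np.asarray(membership, dtype=float)[:, :-1])
    columns.append(x)                      # marker is the last column
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    dof = len(y) - design.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef[-1] / np.sqrt(cov[-1, -1])

# Simulated panel (all values hypothetical): two groups, one neutral marker.
rng = np.random.default_rng(42)
n = 400
group = rng.integers(0, 2, size=n)
membership = np.column_stack([group == 0, group == 1]).astype(float)
allele_freq = np.where(group == 0, 0.9, 0.1)        # contrasted frequencies
marker = (rng.random(n) < allele_freq).astype(float)
trait = 60.0 + 20.0 * group + rng.normal(0.0, 3.0, size=n)  # no marker effect

t_naive = marker_t_statistic(trait, marker)
t_adjusted = marker_t_statistic(trait, marker, membership)
```

The naive test gives a very large statistic for this neutral marker, whereas the statistic adjusted for group membership stays in the range expected under the null hypothesis. The same design also shows the limit discussed later in the text: a causal polymorphism that is nearly collinear with group membership would lose most of its signal after adjustment.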
Such tests have been performed in a variety of crops and genes. They identified a number of positive associations in maize (Palaisa et al., 2003; Szalma et al., 2005; Whitt et al., 2002; Wilson et al., 2004; see Yu and Buckler, 2006 for a review). The first association reported in plants, between flowering time and polymorphisms in the D8 gene (Thornsberry et al., 2001), is a particularly interesting illustration since it has now been addressed in at least two additional studies, that of Andersen et al. (2005) on a set of 71 inbred lines and that of Camus-Kulandaivelu et al. (2006) on the three panels of materials presented above. All studies found highly significant effects when not correcting for population structure. When correcting for population structure, no effect on flowering time was detected any longer for any polymorphism in Andersen et al. (2005) or in the first cycle inbred line panel of Camus-Kulandaivelu et al. (2006). The 6-bp indel at position 3472 tested by Camus-Kulandaivelu et al. (2006) remained significant, although not highly, in the total inbred line panel presented above. It remained highly significant on the panel of landraces. It was observed by Andersen et al. (2005) and Camus-Kulandaivelu et al. (2006) that the deletion allele of the 6-bp indel polymorphism associated with early flowering time in the study of Thornsberry et al. (2001) was highly specific to the Northern Flint group and derived materials. A possible interpretation for the lack of significance of some results is therefore that selection of this early allele in the Northern Flint group, for local adaptation, has led to a high collinearity between D8 and population structure, masking the polymorphism effect when using Model 2. Similar situations are likely to occur for adaptive phenotypic traits.
They should be further investigated using a suitable genetic material such as inbred lines or families within open pollinated varieties with balanced frequencies for the polymorphism(s) of interest.
Also, to go beyond this empirical interpretation, we quantified the differentiation of this polymorphism among groups (Camus-Kulandaivelu et al., 2006). Frequencies of allelic forms were evaluated using a logistic regression model for inbred lines and a regression model for populations. This confirmed that the highest frequencies of the D8 deletion were observed in the earliest groups (frequencies of 0.82, 0.56, 0.09, 0.38, 0.02 in the Northern Flint, European Flint, Corn Belt Dent, Stiff Stalk, and Tropical groups of the total inbred panel, respectively). These frequencies were in turn used to estimate the relative differentiation among groups (Gst). It appeared that the Gst for the D8 polymorphism was higher than that of any SSR marker used to analyze population structure. It reached a maximum of 0.468 for the first-cycle inbred line panel whereas the maximum value reached for 67 SSR loci was 0.194 (note that the theoretical expectations of these values do not depend on mutation rate). Following a classical population genetics reasoning (Beaumont and Nichols, 1996), this strongly suggests that D8 has been involved in differential selection for adaptation. Beyond the D8 case, this illustrates that Model 2, which aims at preventing the detection of false positives due to population structure, may on the other hand mask true associations in case of differential selection between groups. Analysis of relative differentiation thus appears to be a complementary aid to the interpretation of results.
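The relative differentiation among groups can be sketched for a biallelic polymorphism as Gst = (Ht − Hs)/Ht, where Ht is the expected heterozygosity computed from the mean allele frequency and Hs is the mean within-group expected heterozygosity. The example below applies this textbook definition to the five group frequencies quoted above, assuming equal group weights; the weighting, the function name, and the absence of any sample-size correction are our assumptions, so the result is illustrative and is not expected to reproduce the published estimates.

```python
import numpy as np

def gst_biallelic(freqs, weights=None):
    """Gst of a biallelic locus from per-group allele frequencies.

    freqs: frequency of one allele in each group.
    weights: relative group weights (equal weights if omitted).
    """
    p = np.asarray(freqs, dtype=float)
    if weights is None:
        w = np.full(p.shape, 1.0 / p.size)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    p_mean = np.sum(w * p)
    h_total = 2.0 * p_mean * (1.0 - p_mean)        # Ht
    h_within = np.sum(w * 2.0 * p * (1.0 - p))     # Hs
    return (h_total - h_within) / h_total

# Frequencies of the D8 deletion quoted in the text for the total inbred
# panel: Northern Flint, European Flint, Corn Belt Dent, Stiff Stalk, Tropical.
d8_deletion_freqs = [0.82, 0.56, 0.09, 0.38, 0.02]
gst_d8 = gst_biallelic(d8_deletion_freqs)

# A polymorphism with identical frequencies in all groups is not differentiated.
gst_flat = gst_biallelic([0.4, 0.4, 0.4, 0.4, 0.4])
```

Comparing such a value with the distribution obtained for presumably neutral markers is the reasoning used in the text to suggest differential selection.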
Finally, it has to be noted that, beyond the effect of population structure, association tests may also be affected by the relatedness of individuals. A simple and extreme case is when one line has been used as a major progenitor, the progeny of which is represented by several lines within the panel of interest. These lines likely share some relatively rare alleles, which therefore become associated with the phenotypic characteristics inherited from the common progenitor. In other words, pedigree relatedness may cause some LD between rare alleles specific to major progenitors, which may explain the within-group LD between unlinked polymorphisms found in some studies (Stich et al., 2005, 2006). Addressing such situations with subpopulation models should generally prove difficult. First, determining population structure in such situations raises problems regarding the underlying hypotheses of the STRUCTURE software (see above). Second, addressing association tests with numerous subgroups corresponding to major progenitors as covariates with fixed effects may lead to an over-parameterization of the models. An alternative solution in this case is to use the mixed model approach proposed by Yu et al. (2006). Such a model includes fixed subpopulation effects as covariates, a fixed effect of the locus of interest, and a random polygenic effect that takes into account the genetic covariance between individuals due to relatedness, similar to what is classically performed in Best Linear Unbiased Prediction in animal breeding (Henderson, 1975). This model is equivalent to Model 2 when the random polygenic component is omitted. Note that the estimation of relatedness can be based on a sample of markers, following the approach of Ritland (1996). This greatly facilitates the application of the method, compared to the complex management of possibly unreliable pedigree relationships. The efficiency of mixed models in terms of power vs. type I risk has been assessed by Yu et al. (2006), and their use should rapidly become widespread. The method has for instance been tested very recently in Arabidopsis, with a comparison of different covariates (STRUCTURE results or PCA) and different methods for estimating relatedness between individuals from marker data (Zhao et al., 2007).
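The fixed-effect part of such a mixed model can be sketched with generalized least squares. The example below is a simplified illustration, not the implementation of Yu et al. (2006): the variance components `var_g` and `var_e` are assumed known (in practice they would be estimated, for instance by restricted maximum likelihood), the relatedness matrix is a simple marker-based similarity rather than the estimator of Ritland (1996), and all data are simulated.

```python
import numpy as np

def gls_fixed_effects(trait, design, kinship, var_g, var_e):
    """Generalized least-squares estimates of the fixed effects in
    y = X beta + u + e, with u ~ N(0, var_g * K) and e ~ N(0, var_e * I).

    The variance components are taken as known in this sketch.
    """
    y = np.asarray(trait, dtype=float)
    x = np.asarray(design, dtype=float)
    v = var_g * np.asarray(kinship, dtype=float) + var_e * np.eye(len(y))
    v_inv_x = np.linalg.solve(v, x)      # V^-1 X
    v_inv_y = np.linalg.solve(v, y)      # V^-1 y
    return np.linalg.solve(x.T @ v_inv_x, x.T @ v_inv_y)

# Simulated example (all values hypothetical).
rng = np.random.default_rng(1)
n, m = 50, 200
background = rng.integers(0, 2, size=(n, m)).astype(float)
centered = background - background.mean(axis=0)
kinship = centered @ centered.T / m      # simple marker-based similarity
x_marker = rng.integers(0, 2, size=n).astype(float)
design = np.column_stack([np.ones(n), x_marker])   # intercept + locus
trait = 2.0 * x_marker + rng.normal(size=n)

beta_mixed = gls_fixed_effects(trait, design, kinship, var_g=1.0, var_e=1.0)
beta_no_polygenic = gls_fixed_effects(trait, design, kinship, var_g=0.0, var_e=1.0)
```

Setting `var_g` to zero removes the polygenic term, and the estimates then coincide with ordinary least squares, which corresponds to the equivalence with Model 2 stated in the text. Subpopulation covariates would simply be added as further columns of `design`.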
Local Structure of LD and Resolution of Association Mapping
Supposing the global effects of population structure and relatedness to be addressed appropriately, the next question is the resolution of association studies. Contrary to usual mapping populations, generally created in a few generations, populations considered for association genetics result from long and complex evolutionary processes. The accumulated generations have given opportunities for numerous recombination events, reshuffling the ancestral chromosome segments into small pieces, which should lead to a high resolution. A widely used approach to assess this resolution has been to compute pairwise LD statistics (such as r²) and relate them to physical distance.
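The pairwise statistic mentioned above can be sketched for two biallelic loci scored on haplotypes or inbred lines (alleles coded 0/1), using the standard definition r² = D² / [p_A(1 − p_A) p_B(1 − p_B)] with D = p_AB − p_A p_B. The function name and toy data are ours.

```python
import numpy as np

def ld_r2(locus_a, locus_b):
    """Pairwise r^2 between two biallelic loci coded 0/1 on the same
    set of haplotypes (or homozygous inbred lines)."""
    a = np.asarray(locus_a, dtype=float)
    b = np.asarray(locus_b, dtype=float)
    p_a, p_b = a.mean(), b.mean()
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both loci must be polymorphic")
    d = np.mean(a * b) - p_a * p_b       # disequilibrium coefficient D
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Toy haplotypes (hypothetical).
r2_complete = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])   # alleles always together
r2_none = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])       # independent loci
```

Plotting such values against the physical distance between loci gives the LD decay curves discussed in this section.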
Empirical surveys of LD patterns published in maize have shown that intragenic LD generally decreases rapidly with physical distance (in general no longer significant after 1 kb, see Tenaillon et al., 2001; Remington et al., 2001). However, these results have revealed variation among genes; there is also a tendency toward longer LD span when going from ancestral toward elite material (Ching et al., 2002). Knowledge of LD over broader physical distances is still limited in maize. Long-range LD has been reported such as between Tb1 and D8 genes, which are more than 100 kb apart (Remington et al., 2001), between the Y1 gene and neighboring genes over 600 kb in one direction (Palaisa et al., 2004), and around the adh1 gene over 500 kb (Jung et al., 2004). Also, investigation of diversity in the vicinity of the domestication gene Tb1 has underlined a selective sweep up to 60 to 90 kb away from the 5' end of the gene (Clark et al., 2004). In maize, due to the allogamous reproductive mode, these situations may be specific to regions submitted to strong selection pressure or to elite material with a relatively narrow genetic basis. Such long-range LD appears to be frequent in autogamous cereals such as wheat (Breseghello and Sorrells, 2006) and barley (Hordeum vulgare L.) (Kraakman et al., 2004), which made it possible in these cases to run LD mapping studies. As for maize, LD has been reported to decrease to a large extent in ancestral barley populations (Caldwell et al., 2006). Finally it can be noted that medium range LD (several to 15 kb) was reported in sorghum [Sorghum bicolor (L.) Moench] (Hamblin et al., 2005).
Whatever the scale addressed, one major limitation of usual LD analyses is that they summarize multilocus information as a series of pairwise locus statistics. Recently, new methods have been developed to model multilocus diversity globally in terms of haplotype segments and underlying evolutionary parameters such as recombination and mutation rates (see for instance Li and Stephens, 2003). The aim of such haplotype-based models is twofold: (i) to provide a more readable and comprehensive view of LD patterns related to evolutionary parameters of interest, and (ii) to approximate the probability of haplotypes given the markers, providing an efficient statistical and predictive framework to investigate the relationship between genetic diversity and phenotypic variation.
Haplotype-based approaches have been intensively developed since the pioneering work of McPeek and Strahs (1999) in human case–control studies. There are presently two main philosophies for tackling the issue raised by modeling haplotype diversity. First, several authors have developed methods that aim at reconstructing consistent genealogies by using either stochastic algorithms (see for instance Fearnhead and Donnelly, 2001; Larribe and Lessard, 2002; Zollner and Pritchard, 2005) or dynamic algorithms (Song and Hein, 2005). On the other hand, by assuming that the recombination process can be approximated by a Markovian process, Hidden Markov Model (HMM)-based procedures have been devised (McPeek and Strahs, 1999; Liu et al., 2001; Morris et al., 2002; Li and Stephens, 2003). This latter modeling has the advantage of being more computationally tractable than the former procedure (Li and Stephens, 2003). Although these haplotype modeling methods generally rely on Wright–Fisher population models, which are not realistic for many plant species, they seem to be relatively robust to departure from the model assumptions.
In this context, the haplotype block HMM devised by Greenspan and Geiger (2004) provides an interesting statistical framework for mining LD patterns. In their model, the haplotype diversity is partitioned into blocks in which the observed haplotypes are assumed to derive only by mutation from a small number of "ancestral" haplotypes. These ancestral haplotypes represent the few haplotypes that have passed through a bottleneck. Then, to take into account the variation of factors such as genetic drift and selection along the genome, the model allows the number of ancestral haplotypes to vary between blocks. For most cultivated plants, the joint effect of human domestication and/or recent intensive breeding programs can be interpreted as successive bottlenecks, the intensity and duration of which vary both in time and along the genome. Locally, one can reasonably assume that only a few ancestral haplotypes have contributed to forging the current haplotype diversity. In other words, current individual haplotypes can be viewed as a mosaic of a few ancestral haplotypes that have been reshuffled into smaller pieces over generations. Nevertheless, the assumption of a haplotype block structure can potentially be a limitation to the use of this model in plants. Although an empirical haplotype block structure has been reported in humans, there is presently no strong evidence in plants for such a particular LD pattern, and it is worth noting that this phenomenological view of genetic diversity is still a controversial issue in human genetics.
As discussed by Li and Stephens (2003), the block assumption can be relaxed by adopting a marker-by-marker HMM process in which recombination rate heterogeneity along the sequence is properly parametrized in the model. In Fig. 3, we present the results of analyses of three maize candidate genes, Dwarf3 (D3), Id1, and Dwarf8 (D8), using a haplotype model (Veyrieras, 2006) close to that of Greenspan and Geiger (2004) but which relaxes the block partition assumption using the HMM parametrization idea of Li and Stephens (2003). Note that the LD structure of these genes has been previously studied by Remington et al. (2001). Whereas four ancestral haplotypes seem to best capture the haplotypic diversity for Id1 and D3, three are enough for D8. It is worth noting that applying the same algorithm to the Tb1 gene led to a unique ancestral haplotype. This may be due to stronger human artificial selection on Tb1 during the domestication process of maize than on the three other genes.
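The mosaic view of haplotypes can be illustrated with a small Viterbi decoder. This is only a sketch of the general idea and not the algorithm of Veyrieras (2006): it assumes that the ancestral haplotypes are already known, that the switch probability between adjacent markers is constant (whereas the model described above parametrizes recombination heterogeneity along the sequence), and that mutation is summarized by a single mismatch probability. The function name `viterbi_mosaic` and the parameters `switch_prob` and `mismatch_prob` are hypothetical.

```python
import math


def viterbi_mosaic(observed, ancestors, switch_prob=0.05, mismatch_prob=0.01):
    """Most likely ancestral origin of each marker of one observed haplotype.

    observed  : string of alleles, e.g. "00101"
    ancestors : list of strings of the same length (assumed known here)
    Returns a list giving, for each marker, the index of the ancestor copied.
    """
    if not ancestors or not observed:
        raise ValueError("need at least one ancestor and one marker")
    if any(len(a) != len(observed) for a in ancestors):
        raise ValueError("all haplotypes must have the same length")
    k = len(ancestors)

    def log_emit(state, site):
        # A mismatch with the copied ancestor is explained by mutation.
        match = ancestors[state][site] == observed[site]
        return math.log(1 - mismatch_prob if match else mismatch_prob)

    # With probability switch_prob the copying process restarts from an
    # ancestor drawn uniformly (possibly the same one).
    log_stay = math.log((1 - switch_prob) + switch_prob / k)
    log_move = math.log(switch_prob / k)

    score = [math.log(1 / k) + log_emit(s, 0) for s in range(k)]
    backpointers = []
    for site in range(1, len(observed)):
        new_score, pointers = [], []
        for s in range(k):
            candidates = [score[p] + (log_stay if p == s else log_move)
                          for p in range(k)]
            best_prev = max(range(k), key=lambda p: candidates[p])
            new_score.append(candidates[best_prev] + log_emit(s, site))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back the best path from the last marker to the first.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical ancestral haplotypes and two observed haplotypes.
toy_ancestors = ["00000000", "11111111"]
print(viterbi_mosaic("00001111", toy_ancestors))  # one switch of ancestor
print(viterbi_mosaic("00100000", toy_ancestors))  # isolated mismatch
```

With these illustrative parameter values, a long run of matching markers is explained by a change of ancestor, whereas a single discordant marker is explained by mutation, which is the distinction the right-hand panels of Fig. 3 rely on.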

Figure 3. Results of the analysis of the genes D3, D8, and Id1 (data from Remington et al., 2001) using our Hidden Markov Model (HMM)-based algorithm. The haplotypes are sorted according to their leaf position in a neighbor-joining tree based on their Euclidean distance matrix. For each gene, the left part of the figure depicts the pattern of mutation in the raw data set and the right part the ancestral origins of the markers along the gene. At the top of each figure, the inferred ancestral haplotypes are displayed (i.e., four for D3 and Id1, and three for D8). The average diversity between ancestral haplotypes for D3, Id1, and D8 is 0.50, 0.48, and 0.61, respectively. The extra spaces between single nucleotide polymorphism (SNP) sites are based on a block structure of the haplotypes inferred by the HMM-based algorithm of Anderson and Novembre (2003). Although some ancestral fragments line up with the block boundaries (for D8 the block structure and our model give a similar pattern), for D3 and Id1 some recombination events detected by our model occur within blocks.
An interesting feature of this analysis is the case of D8. For this gene, the model suggests a low proportion of recombinant haplotypes compared with Id1 and D3 (88% of the observed haplotypes in D8 derive directly from the three ancestral ones). Note that the persistence of LD with distance has been shown to be more important for D8 than for the two other genes (Remington et al., 2001). Similarly, the estimated values of the recombination rate, namely r, for each gene highlight this difference: r = 5.75 for D3, r = 1.61 for Id1, and r = 0.25 for D8 (these values are given per kilobase pair). This apparent singularity of D8 was also pointed out by Remington et al. (2001), who hypothesized that, due to its role in flowering time variation, D8 may have been under strong divergent selection for adaptation to contrasting environments. Results from previous sections suggest that one of the D8 haplotypes is highly specific to the Northern Flint group and its derivatives, which should have limited opportunities for recombination with other ancestral haplotypes. Considering this, further efforts are needed to evaluate the length of this haplotype segment by investigating further 5' and 3' regions in the panel of interest. Conversely, the magnitude of reshuffling observed for the other genes supports a high resolution of association mapping. Such modeling efforts therefore appear suitable in parallel to association mapping approaches.
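A summary such as the proportion of haplotypes derived directly from a single ancestral haplotype can be computed from per-marker ancestry assignments. The sketch below assumes a hypothetical input format (one list of ancestor indices per observed haplotype); the function name `nonrecombinant_fraction` and the toy paths are illustrative and do not reproduce the D8 data.

```python
def nonrecombinant_fraction(ancestry_paths):
    """Fraction of haplotypes whose markers all trace back to one ancestor."""
    if not ancestry_paths:
        raise ValueError("at least one ancestry path is required")
    intact = sum(1 for path in ancestry_paths if len(set(path)) == 1)
    return intact / len(ancestry_paths)


# Hypothetical ancestry paths for four haplotypes over three markers;
# only the third one switches ancestor along the gene.
toy_paths = [[0, 0, 0], [1, 1, 1], [0, 0, 1], [2, 2, 2]]
print(nonrecombinant_fraction(toy_paths))
```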
 |
Perspectives and Applications in Breeding
|
|---|
Association genetics and related approaches are expanding very fast in plant genetics, and recently developed models should contribute to greater efficiency, especially in terms of the balance between power and type I error. Application of the haplotype modeling presented in the third section to association genetics and/or LD mapping should help to further improve the approach but is still a matter of research. Due to its predictive feature, it should in principle facilitate the determination of the tag single nucleotide polymorphisms that best capture the haplotype diversity of a region, make it possible to develop new association tests using an interval mapping strategy (or sliding window approach), and, based on this, allow evaluation of the local resolution of association or LD mapping. The same approach should also facilitate the analysis of multiparental QTL mapping designs (Blanc et al., 2006; Rebai et al., 1997) by facilitating the a priori grouping of parents into identity-based allele classes at a given genomic location, following the approach proposed by Jansen et al. (2003). All these approaches should facilitate the identification of alleles of interest and tightly linked markers, which should significantly enhance the efficiency of marker-assisted selection (Bernardo and Charcosset, 2006). Note that, in this context, haplotype modeling could also help in making decisions during the breeding process itself, in line with the "breeding by design" concept proposed by Peleman and van der Voort (2003).
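As an illustration of what tag SNP determination involves, the sketch below uses a simple greedy heuristic: it repeatedly adds the site that separates the largest number of haplotype groups until every distinct haplotype is told apart. This heuristic is an assumption made for the example, not the model-based selection envisaged above, and the function name `greedy_tag_snps` and the toy haplotypes are hypothetical.

```python
def greedy_tag_snps(haplotypes):
    """Greedily pick sites that distinguish all distinct haplotypes.

    haplotypes : list of equal-length allele strings, e.g. ["0011", "1100"]
    Returns the sorted indices of the selected sites.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    if any(len(h) != n_sites for h in distinct):
        raise ValueError("all haplotypes must have the same length")

    groups = [distinct]          # haplotypes not yet told apart by the tags
    tags = []
    while any(len(group) > 1 for group in groups):
        def groups_after(site):
            # Number of groups obtained if this site were added to the tags.
            return sum(len({h[site] for h in group}) for group in groups)

        candidates = [s for s in range(n_sites) if s not in tags]
        best = max(candidates, key=groups_after)
        split = []
        for group in groups:
            by_allele = {}
            for h in group:
                by_allele.setdefault(h[best], []).append(h)
            split.extend(by_allele.values())
        groups = split
        tags.append(best)
    return sorted(tags)


# Hypothetical haplotypes: sites 0 and 1 are redundant, as are sites 2 and 3.
toy_region = ["0000", "0011", "1100", "1111"]
print(greedy_tag_snps(toy_region))
```

A greedy choice does not guarantee the smallest possible tag set; it is used here only because it is short and easy to follow.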
 |
REFERENCES
|
|---|
- Andersen, J.R., T. Schrag, A.E. Melchinger, I. Zein, and T. Lubberstedt. 2005. Validation of Dwarf8 polymorphisms associated with flowering time in elite European inbred lines of maize (Zea mays L.). Theor. Appl. Genet. 111:206–217.
- Anderson, E., and W.L. Brown. 1952. Origin of Corn Belt maize and its genetic significance. p. 124–148. In J.W. Gowen (ed.) Heterosis. Iowa State College Press, Ames, IA.
- Anderson, E.C., and J. Novembre. 2003. Finding haplotype block boundaries by using the minimum-description-length principle. Am. J. Hum. Genet. 73:336–354.
- Balding, D.J. 2006. A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7:781–791.
- Beaumont, M.A., and R.A. Nichols. 1996. Evaluating loci for use in the genetic analysis of population structure. Proc. R. Soc. Lond. B. Biol. Sci. 263:1619–1626.
- Bernardo, R., and A. Charcosset. 2006. Usefulness of gene information in marker-assisted recurrent selection: A simulation appraisal. Crop Sci. 46:614–621.
- Blanc, G., A. Charcosset, B. Mangin, A. Gallais, and L. Moreau. 2006. Connected populations for detecting quantitative trait loci and testing for epistasis: An application in maize. Theor. Appl. Genet. 113:206–224.
- Breseghello, F., and M.E. Sorrells. 2006. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 172:1165–1177.
- Caldwell, K.S., J. Russell, P. Langridge, and W. Powell. 2006. Extreme population-dependent linkage disequilibrium detected in an inbreeding plant species, Hordeum vulgare. Genetics 172:557–567.
- Camus-Kulandaivelu, L., J.B. Veyrieras, B. Gouesnard, A. Charcosset, and D. Manicacci. 2007. Evaluating the reliability of STRUCTURE outputs in case of relatedness between individuals. Crop Sci. 47:887–890.
- Camus-Kulandaivelu, L., J.B. Veyrieras, D. Madur, V. Combes, M. Fourmann, S. Barraud, P. Dubreuil, B. Gouesnard, D. Manicacci, and A. Charcosset. 2006. Maize adaptation to temperate climate: Relationship between population structure and polymorphism in the Dwarf8 gene. Genetics 172:2449–2463.
- Chardon, F., B. Virlon, L. Moreau, M. Falque, J. Joets, L. Decousset, A. Murigneux, and A. Charcosset. 2004. Genetic architecture of flowering time in maize as inferred from quantitative trait loci meta-analysis and synteny conservation with the rice genome. Genetics 168:2169–2185.
- Ching, A., K.S. Caldwell, M. Jung, M. Dolan, O.S. Smith, S. Tingey, M. Morgante, and A.J. Rafalski. 2002. SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines. BMC Genet. 8:19–33.
- Clark, R.M., E. Linton, J. Messing, and J.F. Doebley. 2004. Pattern of diversity in the genomic region near the maize domestication gene tb1. Proc. Natl. Acad. Sci. USA 101:700–707.
- Clark, R.M., T.N. Wagler, P. Quijada, and J. Doebley. 2006. A distant upstream enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and inflorescent architecture. Nat. Genet. 38:594–597.
- Crepieux, S., C. Lebreton, B. Servin, and G. Charmet. 2004. Quantitative trait loci (QTL) detection in multicross inbred designs: Recovering QTL identical-by-descent status information from marker data. Genetics 168:1737–1749.
- Doebley, J., J.D. Wendel, J.S.C. Smith, C.W. Stuber, and M.M. Goodman. 1988. The origin of Cornbelt maize—The isozyme evidence. Econ. Bot. 42:120–131.
- Doebley, J.F., M.M. Goodman, and C.W. Stuber. 1986. Exceptional genetic divergence of Northern Flint Corn. Am. J. Bot. 73:64–69.
- Dubreuil, P., M. Warburton, M. Chastanet, D. Hoisington, and A. Charcosset. 2006. More on the introduction of temperate maize into Europe: Large-scale bulk SSR genotyping and new historical elements. Maydica 51:281–291.
- Evanno, G., S. Regnaut, and J. Goudet. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol. Ecol. 14:2611–2620.
- Fearnhead, P., and P. Donnelly. 2001. Estimating recombination rates from population genetic data. Genetics 159:1299–1318.
- Flint-Garcia, S.A., J.M. Thornsberry, and E.S. Buckler. 2003. Structure of linkage disequilibrium in plants. Annu. Rev. Plant Biol. 54:357–374.
- Flint-Garcia, S.A., A.C. Thuillet, J.M. Yu, G. Pressoir, S.M. Romero, S.E. Mitchell, J. Doebley, S. Kresovich, M.M. Goodman, and E.S. Buckler. 2005. Maize association population: A high-resolution platform for quantitative trait locus dissection. Plant J. 44:1054–1064.
- Garris, A.J., T.H. Tai, J. Coburn, S. Kresovich, and S. McCouch. 2005. Genetic structure and diversity in Oryza sativa L. Genetics 169:1631–1638.
- Greenspan, G., and D. Geiger. 2004. Model-based inference of haplotype block variation. J. Comput. Biol. 11:495–506.
- Gupta, P.K., S. Rustgi, and P.L. Kulwal. 2005. Linkage disequilibrium and association studies in higher plants: Present status and future prospects. Plant Mol. Biol. 57:461–485.
- Hallauer, A.R. 1990. Methods used in developing maize inbreds. Maydica 35:1–16.
- Hamblin, M.T., M.G.S. Fernandez, A.M. Casa, S.E. Mitchell, A.H. Paterson, and S. Kresovich. 2005. Equilibrium processes cannot explain high levels of short- and medium-range linkage disequilibrium in the domesticated grass Sorghum bicolor. Genetics 171:1247–1256.
- Henderson, C.R. 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–447.
- Jansen, R.C., J.L. Jannink, and W.D. Beavis. 2003. Mapping quantitative trait loci in plant breeding populations: Use of parental haplotype sharing. Crop Sci. 43:829–834.
- Jung, M., A. Ching, D. Bhattramakki, M. Dolan, S. Tingey, M. Morgante, and A. Rafalski. 2004. Linkage disequilibrium and sequence diversity in a 500-kbp region around the adh1 locus in elite maize germplasm. Theor. Appl. Genet. 109:681–689.
- Kraakman, A.T.W., R.E. Niks, P. Van den Berg, P. Stam, and F.A. Van Eeuwijk. 2004. Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivars. Genetics 168:435–446.
- Larribe, F., and S. Lessard. 2002. Gene mapping via the ancestral recombination graph. Theor. Popul. Biol. 62:215–229.
- Li, N., and M. Stephens. 2003. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165:2213–2233.
- Liu, J.S., C. Sabatti, J. Teng, B.J.B. Keats, and N. Risch. 2001. Bayesian analysis of haplotypes for linkage disequilibrium mapping. Genome Res. 11:1716–1724.
- Liu, K.J., M. Goodman, S. Muse, J.S. Smith, E. Buckler, and J. Doebley. 2003. Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites. Genetics 165:2117–2128.
- Matsuoka, Y., Y. Vigouroux, M.M. Goodman, J. Sanchez G., E.S. Buckler, and J. Doebley. 2002. A single domestication for maize shown by multilocus microsatellite genotyping. Proc. Natl. Acad. Sci. USA 99:6080–6084.
- McPeek, M.S., and A. Strahs. 1999. Assessment of linkage disequilibrium by the decay of haplotype sharing, with application to fine-scale genetic mapping. Am. J. Hum. Genet. 65:858–875.
- Morris, A.P., J.C. Whittaker, and D.J. Balding. 2002. Fine-scale mapping of disease loci via shattered coalescent modeling of genealogies. Am. J. Hum. Genet. 70:686–707.
- Palaisa, K., M. Morgante, S. Tingey, and A. Rafalski. 2004. Long-range patterns of diversity and linkage disequilibrium surrounding the maize Y1 gene are indicative of an asymmetric selective sweep. Proc. Natl. Acad. Sci. USA 101:9885–9890.
- Palaisa, K.A., M. Morgante, M. Williams, and A. Rafalski. 2003. Contrasting effects of selection on sequence diversity and linkage disequilibrium at two phytoene synthase loci. Plant Cell 15:1795–1806.
- Parisseaux, B., and R. Bernardo. 2004. In silico mapping of quantitative trait loci in maize. Theor. Appl. Genet. 109:508–514.
- Patterson, N., A.L. Price, and D. Reich. 2006. Population structure and Eigenanalysis. Available at genetics.plosjournals.org/. PLoS Genet. 2:e190.
- Peleman, J.D., and J.R. van der Voort. 2003. Breeding by design. Trends Plant Sci. 8:330–334.
- Pflieger, S., V. Lefebvre, and M. Causse. 2001. The candidate gene approach in plant genetics: A review. Mol. Breed. 7:275–291.
- Price, A.H. 2006. Believe it or not, QTLs are accurate! Trends Plant Sci. 11:213–216.
- Price, A.L., N.J. Patterson, R.M. Plenge, M.E. Weinblatt, N.A. Shadick, and D. Reich. 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38:904–909.
- Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945–959.
- Rafalski, A. 2002. Applications of single nucleotide polymorphisms in crop genetics. Curr. Opin. Plant Biol. 5:94–100.
- Rafalski, A., and M. Morgante. 2004. Corn and humans: Recombination and linkage disequilibrium in two genomes of similar size. Trends Genet. 20:103–111.
- Rebai, A., P. Blanchard, D. Perret, and P. Vincourt. 1997. Mapping quantitative trait loci controlling silking date in a diallel cross among four lines of maize. Theor. Appl. Genet. 95:451–459.
- Rebourg, C., M. Chastanet, B. Gouesnard, C. Welcker, P. Dubreuil, and A. Charcosset. 2003. Maize introduction into Europe: The history reviewed in the light of molecular data. Theor. Appl. Genet. 106:895–903.
- Remington, D.L., J.M. Thornsberry, Y. Matsuoka, L.M. Wilson, S.R. Whitt, J. Doebley, S. Kresovich, M.M. Goodman, and E.S. Buckler. 2001. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc. Natl. Acad. Sci. USA 98:11479–11484.
- Ritland, K. 1996. Estimators for pairwise relatedness and individual inbreeding coefficients. Genet. Res. 67:175–185.
- Salvi, S., G. Sponza, M. Morgante, D. Tomes, X. Niu, K.A. Fengler, R. Meeley, E.V. Ananiev, S. Svitashev, E. Bruggemann, B. Li, C.F. Hainey, S. Radovic, G. Zaina, J.A. Rafalski, S.V. Tingey, G.-H. Miao, R.L. Phillips, and R. Tuberosa. 2007. Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proc. Natl. Acad. Sci. USA 104:11376–11381.
- Salvi, S., and R. Tuberosa. 2005. To clone or not to clone plant QTLs: Present and future challenges. Trends Plant Sci. 10:297–304.
- Semon, M., R. Nielsen, M.P. Jones, and S.R. McCouch. 2005. The population structure of African cultivated rice Oryza glaberrima (Steud.): Evidence for elevated levels of linkage disequilibrium caused by admixture with O. sativa and ecological adaptation. Genetics 169:1639–1647.
- Smith, B.D. 1989. Origins of agriculture in eastern North America. Science 246:1566–1571.
- Song, Y.S., and J. Hein. 2005. Constructing minimal ancestral recombination graphs. J. Comput. Biol. 12:147–169.
- Stich, B., A.E. Melchinger, M. Frisch, H.P. Maurer, M. Heckenberger, and J.C. Reif. 2005. Linkage disequilibrium in European elite maize germplasm investigated with SSRs. Theor. Appl. Genet. 111:723–730.
- Stich, B., H.P. Maurer, A.E. Melchinger, M. Frisch, M. Heckenberger, J.R. van der Voort, J. Peleman, A.P. Sorensen, and J.C. Reif. 2006. Comparison of linkage disequilibrium in elite European maize inbred lines using AFLP and SSR markers. Mol. Breed. 17:217–226.
- Storey, J.D., and R. Tibshirani. 2003. Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. USA 100:9440–9445.
- Szalma, S.J., E.S. Buckler, M.E. Snook, and M.D. McMullen. 2005. Association analysis of candidate genes for maysin and chlorogenic acid accumulation in maize silks. Theor. Appl. Genet. 110:1324–1333.
- Tang, H., J. Peng, P. Wang, and N.J. Risch. 2005. Estimation of individual admixture: Analytical and study design considerations. Genet. Epidemiol. 28:289–301.
- Tenaillon, M.I., M.C. Sawkins, A.D. Long, R.L. Gaut, J.F. Doebley, and B.S. Gaut. 2001. Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc. Natl. Acad. Sci. USA 98:9161–9166.
- Thornsberry, J.M., M.M. Goodman, J. Doebley, S. Kresovich, D. Nielsen, and E.S. Buckler. 2001. Dwarf8 polymorphisms associate with variation in flowering time. Nat. Genet. 28:286–289.
- Veyrieras, J.B. 2006. Etude du déterminisme génétique de caractères quantitatifs chez les végétaux : méta-analyse de QTL et génétique d'association. Ph.D. diss. INAPG, Paris.
- Veyrieras, J.B., B. Goffinet, and A. Charcosset. 2007. MetaQTL: A package of new computational methods for the meta-analysis of QTL mapping experiments. BMC Bioinformatics 8(1):49.
- Whitt, S.R., L.M. Wilson, M.I. Tenaillon, B.S. Gaut, and E.S. Buckler, IV. 2002. Genetic diversity and selection in the maize starch pathway. Proc. Natl. Acad. Sci. USA 99:12959–12962.
- Wilson, L.M., S.R. Whitt, A.M. Ibanez, T.R. Rocheford, M.M. Goodman, and E.S. Buckler. 2004. Dissection of maize kernel composition and starch production by candidate gene association. Plant Cell 16:2719–2733.
- Wu, B., N. Liu, and H. Zhao. 2006. PSMIX: An R package for population structure inference via maximum likelihood method. BMC Bioinformatics 7.
- Yu, J.M., and E.S. Buckler. 2006. Genetic association mapping and genome organization of maize. Curr. Opin. Biotechnol. 17:155–160.
- Yu, J.M., G. Pressoir, W.H. Briggs, I.V. Bi, M. Yamasaki, J.F. Doebley, M.D. McMullen, B.S. Gaut, D.M. Nielsen, J.B. Holland, S. Kresovich, and E.S. Buckler. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38:203–208.
- Zhang, Y.-M., Y. Mao, C. Xie, H. Smith, L. Luo, and S. Xu. 2005. Mapping quantitative trait loci using naturally occurring genetic variance among commercial inbred lines of maize (Zea mays L.). Genetics 169:2267–2275.
- Zhao, K., M. Aranzana, J.S. Kim, C. Lister, C. Shindo, C. Tang, C. Toomajian, H. Zheng, C. Dean, P. Marjoram, and M. Nordborg. 2007. An Arabidopsis example of association mapping in structured samples. Available at genetics.plosjournals.org/. PLoS Genet. 3:e4.
- Zollner, S., and J.K. Pritchard. 2005. Coalescent-based association mapping and fine mapping of complex trait loci. Genetics 169:1071–1092.