Crop Science
Published online 6 May 2005
Published in Crop Sci 45:1004-1016 (2005)
© 2005 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

CROP BREEDING, GENETICS & CYTOLOGY

An Integrated Biplot Analysis System for Displaying, Interpreting, and Exploring Genotype x Environment Interaction

Weikai Yan*,1 and Nicholas A. Tinker

Eastern Cereal and Oilseed Research Center, Agriculture and Agri-Food Canada, 960 Carling Ave., Ottawa, Ontario, Canada, K1A 0C6

* Corresponding author (wyan@ggebiplot.com; yanw@agr.gc.ca)


    ABSTRACT
Multienvironment trials (MET) generate two types of two-way data: genotype x environment data for a target trait and genotype x trait data in individual or across environments. These data can be visually analyzed by a GGE biplot and a genotype x trait biplot, respectively. This paper describes a third type of biplot, the covariate-effect biplot, and illustrates its tandem use with the other biplots to achieve a fuller understanding of MET data. The covariate-effect biplot is generated on the basis of an explanatory trait x environment two-way table consisting of correlation coefficients between the target trait (e.g., yield) and each of the other traits in each of the environments. This biplot displays the yield-trait relations in individual environments and addresses whether and how the genotype x environment interactions (GE) for yield can be explored by indirect selection for the other traits. These other traits are treated as genetic covariables and can be replaced by other genetic covariables such as genetic markers, QTL, or genes. The biplot methodology was demonstrated using data from barley (Hordeum vulgare L.) MET conducted across North America. Both the GGE biplot and the covariate-effect biplot showed that the environments fell into two (eastern vs. western) megaenvironments. The covariate-effect pattern explained 81% of the GGE pattern, suggesting that the GE pattern for yield can be effectively explored by indirect selection for these traits. Specifically, barley yield can be improved by selecting for larger kernel weight, earlier heading, and better lodging resistance in the eastern megaenvironment. In contrast, the yield–trait relationship in the western megaenvironment was highly variable, and yield improvement can be achieved only by selecting for yield per se across environments. We suggest that the GGE biplot, the genotype x trait biplot, and the covariate-effect biplot be used jointly to better understand and more fully explore MET data.

Abbreviations: AMMI, Additive main effect and multiplicative interaction • GE, genotype x environment interaction • GGE, genotype main effect plus genotype x environment interaction • MET, multi-environment trials • PC, principal component(s) • PCA, principal component analysis • PLSR, partial least squares regression • SVD, singular value decomposition


    INTRODUCTION
Following the proposal of Gabriel (1971), biplots have been increasingly used in the analysis of MET (Bradu and Gabriel, 1978; Kempton, 1984; Zobel et al., 1988; Gauch, 1992; Cooper et al., 1997). Typically, MET data can be categorized as genotype x environment data for a trait (e.g., yield) and genotype x trait data in individual environments. Biplots are an effective tool in visual analysis of both types of two-way data. A GGE biplot, which simultaneously displays the genotype main effect (G) and the GE of a genotype x environment two-way table (Yan et al., 2000; Yan, 2001; Yan and Kang, 2003), can visually address many questions relative to cultivar and test environment evaluation. On the basis of a single GGE biplot, cultivars can be evaluated for their performance in individual environments and across environments, mean performance and stability, and general or specific adaptations. Simultaneously, environments can be visually evaluated and grouped on the basis of their ability to discriminate among genotypes and their representativeness of other test environments. Redundant environments, as well as those that are most appropriate for selecting superior genotypes or culling inferior genotypes, can be visually identified. In addition, a GGE biplot can reveal the "which-won-where" pattern of MET data, which is important for megaenvironment identification and for cultivar recommendations specific to each megaenvironment.

A genotype x trait biplot (Yan and Rajcan, 2002; Yan and Kang, 2003; Lee et al., 2003) graphically approximates a genotype x trait two-way table. Such a biplot can be used to visualize the genetic correlations among traits (breeding objectives), which facilitates a systems understanding of the crop. Understanding the trait relationships also facilitates identification of traits that can be used in indirect selection for a target trait and those that may be redundantly measured. A genotype x trait biplot can also be used to visualize the merits and shortcomings of individual genotypes, which is important for both cultivar evaluation and parent selection.

In spite of their many useful features, genotype x environment biplots have been criticized for not being able to incorporate information on genetic covariables that may explain the GE patterns. To amend this, factorial regression using genetic and/or environmental covariables has been suggested to complement biplot analysis (van Eeuwijk et al., 1996; Vargas et al., 1999; Brancourt-Hulmel and Lecomte, 2003). In this regard, the partial least squares regression (PLSR) plot provides a more attractive approach. It displays genotypes, environments, and genetic covariables in a single PLSR plot (Vargas et al., 1998, 1999; Crossa et al., 1999), which, therefore, may be referred to as a "tri-plot." A PLSR tri-plot appears to combine a genotype x environment biplot and a genotype x trait biplot by using the same set of genotype scores. The genotype scores in a PLSR tri-plot are selected such that the variation from both tables displayed by the tri-plot is maximized. A tri-plot is supposed to have interpretations of both the genotype x environment biplot and the genotype x trait biplot. Moreover, by combining two biplots, the tri-plot brings genetic covariables (e.g., genetic values of some traits) and environments in the same plot, which may allow the genotype x environment patterns for the target trait to be interpreted relative to trait x environment interactions. A genotype x environment biplot displays the most important patterns of the genotype x environment data; a genotype x trait biplot displays the most important patterns of the genotype x trait table; a PLSR tri-plot, however, may fail to adequately display both the genotype x environment patterns and the genotype x trait patterns, and as a result, it may have limited use in interpreting the genotype x environment patterns. This occurs when many irrelevant traits are included in the genotype x trait table. 
Tri-plots can also be constructed from other multivariate analyses such as redundancy analysis and canonical correspondence analysis (A.F. Zuur, http://www.brodgar.com/ordination.htm; verified 6 January 2005).

Yan and Hunt (2001) presented another approach for incorporating genetic and environmental covariables in MET data analysis, in which the observed GGE patterns were explained as interactions between genetic covariables and environmental covariables. This was achieved by relating the genetic–environmental covariables to the genotypic–environmental scores of the first two principal components derived from GGE biplot analysis. These correlation coefficients could be plotted to form a genetic covariable vs. environmental covariable biplot or superimposed on the GGE biplot so that the interactions between them can be visualized.

A full understanding of MET data encompasses (i) the GGE patterns of a target trait, (ii) the genotype x trait patterns in individual environments or across environments, and (iii) whether and how the GGE patterns for a target trait can be explained and exploited using other traits. The purpose of this paper is to describe a "covariate-effect biplot" that can be used to interpret and explore the GGE patterns of the target trait relative to genetic covariables (genetic values of explanatory traits, QTL, or genes). This biplot, together with the previously described GGE biplot and genotype x trait biplot, constitutes an integrated biplot system for MET data analysis.


    MATERIALS AND METHODS
Data Source
Data used in this study come from the database of a North America Barley Genomics project, the Harrington x Tr306 mapping population (Kasha et al., 1995; Tinker et al., 1996). Harrington has been an important malting barley cultivar in North America, and Tr306 is a line with poor malting quality but other merits. One hundred fifty random doubled haploid lines, referred to as genotypes hereafter, were evaluated for grain yield and 21 other agronomic or quality traits in up to 28 environments across Canada and the U.S. Northwest in 1992 and 1993. The data used in this study involve 145 genotypes, with yield data from 25 individual environments and mean values for other traits across environments. The yield data in individual environments are regarded as response variables and the other traits as interpretive variables. A generalized data structure is shown in Table 1.


Table 1. Data structure for the analysis of covariable effects with genetic values of explanatory traits as interpretive variables and yield in different environments as response variables.

 
Generating a GGE Biplot
To generate a GGE biplot (Yan et al., 2000), the genotype x environment two-way table of yield was first environment-standardized; the environment-standardized table was then decomposed into principal components (PC) via singular value decomposition (SVD). The first two PC (PC1 and PC2) were used to generate a GGE biplot, whereas the rest were regarded as residuals as follows:

$$\frac{y_{ij} - \beta_j}{s_j} = \lambda_1 \xi_{i1} \eta_{1j} + \lambda_2 \xi_{i2} \eta_{2j} + \varepsilon_{ij} \qquad [1]$$
where $y_{ij}$ is the yield of genotype i in environment j; $\beta_j$ is the mean yield in environment j; $s_j$ is the standard deviation in environment j; $\lambda_1$ and $\lambda_2$ are the singular values of PC1 and PC2, respectively; $\xi_{i1}$ and $\xi_{i2}$ are the eigenvectors of genotype i for PC1 and PC2, respectively; $\eta_{1j}$ and $\eta_{2j}$ are the eigenvectors of environment j for PC1 and PC2, respectively; and $\varepsilon_{ij}$ is the residual associated with genotype i and environment j. To generate a GGE biplot, Eq. [1] was reorganized as follows:

$$\frac{y_{ij} - \beta_j}{s_j} = g_{i1} e_{1j} + g_{i2} e_{2j} + \varepsilon_{ij} \qquad [2]$$
by assigning

$$g_{il} = \xi_{il} \quad \text{and} \quad e_{lj} = \lambda_l \eta_{lj} \qquad [3]$$
where l = 1, 2. The terms $g_{il}$ are referred to as PC scores for genotype i, and $e_{lj}$ as PC scores for environment j. Equation [3] represents the environment-focused scaling and is appropriate for visualizing the relationship among environments for their ability to differentiate the genotypes. Although there are two other models for generating a GGE biplot (Yan et al., 2000; Yan and Kang, 2003), this model is most appropriate for environment classification with the assumption that all environments are equally important in genotype evaluation, which is consistent with the assumption for a covariate-effect biplot described below.
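
The steps in Eq. [1] to [3] can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the GGEbiplot implementation: the function name `gge_biplot_scores`, the partition argument `f` (borrowed from the partition factor used later for the covariate-effect biplot), and the random yield table are all assumptions made for the example.

```python
import numpy as np

def gge_biplot_scores(y, f=0.0):
    """Return genotype scores, environment scores, and singular values.

    y : (genotypes x environments) array of yields.
    f : partition factor (assumed argument); f = 0 gives the
        environment-focused scaling of Eq. [3].
    """
    y = np.asarray(y, dtype=float)
    # Environment-standardize: subtract each column mean, divide by its SD.
    z = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    # SVD: z = U diag(s) Vt.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    g = u[:, :2] * s[:2] ** f                        # genotype PC scores g_il
    e = (s[:2] ** (1.0 - f))[:, None] * vt[:2, :]    # environment PC scores e_lj
    return g, e, s

# Made-up yields for 10 genotypes in 4 environments (not data from the paper).
rng = np.random.default_rng(0)
yield_table = rng.normal(5.0, 1.0, size=(10, 4))
g, e, s = gge_biplot_scores(yield_table)

# Rank-2 fit of the standardized table, i.e., Eq. [2] without the residual.
approx = g @ e
```

Plotting the rows of `g` and the columns of `e` on the same axes would give the biplot; whatever is not captured by `approx` corresponds to the residual term.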

Generating a Genotype x Trait Biplot
The procedures of generating a genotype x trait biplot are the same as described above for the GGE biplot except that environments are replaced with traits. Such a biplot facilitates visualization of the genetic correlations among traits (Yan and Rajcan, 2002; Lee et al., 2003) and evaluation of the genotype on the basis of multiple traits (Yan and Kang, 2003). Trait-focused singular-value partitioning (Eq. [3]) was used for appropriate visualization of the genetic correlations among traits.

Generating a Covariate-Effect Biplot
The complete dataset used in this study was a two-way table of 145 rows for the genotypes and 21 columns for the explanatory traits plus 25 columns for the yield data in each environment (Table 1). In attempting to explain the GE of yield relative to explanatory traits, a 21 by 25 trait x environment two-way table was first constructed, which contains correlation coefficients between yield and the genetic values of each trait in each of the environments. The correlation coefficients were used as a measure of the effects of the explanatory traits on yield. If the genetic values of the traits are generalized as genetic covariables, the correlation table may be referred to as a table of genetic covariable effects. After the covariate-effect table was constructed, the traits were screened for their relevance; a trait was regarded as relevant if it had a significant association (P < 0.05) with yield in at least one of the environments. The covariate-effect table with eight traits that survived the screening is presented in Table 2.
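
As a rough illustration of how such a covariate-effect table could be assembled, the sketch below correlates every trait with yield in every environment and then screens the traits. The function names, the simulated data, and the use of a caller-supplied critical t value (about 1.977 for 143 degrees of freedom at the two-sided 0.05 level) are assumptions for the example; the paper does not describe how the significance test was computed.

```python
import numpy as np

def covariate_effect_table(traits, yields):
    """Pearson r between each trait column and each environment's yield column.

    Both arrays have genotypes as rows; the result is traits x environments.
    """
    t = np.asarray(traits, dtype=float)
    y = np.asarray(yields, dtype=float)
    n = t.shape[0]
    tz = (t - t.mean(axis=0)) / t.std(axis=0, ddof=1)
    yz = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
    return tz.T @ yz / (n - 1)

def screen_traits(r, n, t_crit):
    """Flag traits whose |r| exceeds the critical value in at least one environment.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom,
    supplied by the caller (an assumption of this sketch).
    """
    df = n - 2
    r_crit = t_crit / np.sqrt(df + t_crit ** 2)
    keep = (np.abs(r) > r_crit).any(axis=1)
    return keep, r_crit

# Simulated data: 145 genotypes, 3 traits, 4 environments (not from the paper).
rng = np.random.default_rng(1)
traits = rng.normal(size=(145, 3))
yields = rng.normal(size=(145, 4))
yields[:, 0] += traits[:, 0]   # make trait 0 clearly related to yield in env 0

r = covariate_effect_table(traits, yields)
keep, r_crit = screen_traits(r, n=145, t_crit=1.977)
```

With 145 genotypes the critical correlation works out to roughly 0.16, so even weak associations pass the screen; a stricter threshold could be substituted through `t_crit`.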


Table 2. Trait x environment two-way table of covariable effects. Each value is the correlation coefficient of yield with the relevant trait in the relevant environment.

 
A biplot was constructed to visualize the covariate-effect table. The table was decomposed into PC via SVD, and the first two PC were used to generate a covariate-effect biplot, whereas the rest were regarded as residuals as follows:

$$r_{ij} = \lambda_1 \xi_{i1} \eta_{1j} + \lambda_2 \xi_{i2} \eta_{2j} + \varepsilon_{ij} \qquad [4]$$
where $r_{ij}$ is the correlation coefficient between yield and trait i in environment j; $\lambda_1$ and $\lambda_2$ are the singular values of PC1 and PC2, respectively; $\xi_{i1}$ and $\xi_{i2}$ are the eigenvectors of trait i for PC1 and PC2, respectively; $\eta_{1j}$ and $\eta_{2j}$ are the eigenvectors of environment j for PC1 and PC2, respectively; and $\varepsilon_{ij}$ is the residual associated with trait i and environment j. To generate a covariate-effect biplot, the singular values ($\lambda_1$ and $\lambda_2$) were partitioned between the trait and the environment eigenvectors so that Eq. [4] could be written as

$$r_{ij} = m_{i1} e_{1j} + m_{i2} e_{2j} + \varepsilon_{ij} \qquad [5]$$
where $m_{i1}$ and $m_{i2}$ are referred to as the PC1 and PC2 scores for trait i, respectively, which define the position of trait i in a two-dimensional biplot. Likewise, $e_{1j}$ and $e_{2j}$ are the PC1 and PC2 scores for environment j, respectively, and define the position of environment j in the biplot. Singular-value partitioning was implemented by assigning

$$m_{il} = \lambda_l^{f_l} \xi_{il} \quad \text{and} \quad e_{lj} = \lambda_l^{1 - f_l} \eta_{lj} \qquad [6]$$
where $f_l$ is the partition factor for PC l (l = 1 and 2). When $f_l = 1$, it is referred to as trait-focused scaling, which is appropriate for visual comparison among, or grouping of, the traits. When $f_l = 0$, it is environment-focused singular-value partitioning and is appropriate for visual comparison among, or grouping of, the environments (Yan, 2002; Yan and Kang, 2003). The two scaling methods are equally valid in recovering the trait x environment covariate-effect table.
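
The claim that both scalings recover the same table can be checked numerically. The following is a minimal sketch with a made-up 4 x 3 correlation table; the function name `covariate_effect_scores` is hypothetical, and the same partition factor is applied to both PCs for simplicity.

```python
import numpy as np

def covariate_effect_scores(r, f):
    """Trait scores m and environment scores e for PC1 and PC2.

    f = 1 gives trait-focused scaling, f = 0 environment-focused scaling.
    """
    u, s, vt = np.linalg.svd(np.asarray(r, dtype=float), full_matrices=False)
    m = u[:, :2] * s[:2] ** f                      # trait scores m_il
    e = (s[:2] ** (1 - f))[:, None] * vt[:2, :]    # environment scores e_lj
    return m, e

# Made-up trait x environment correlations (4 traits, 3 environments).
r = np.array([[0.30, 0.25, -0.10],
              [-0.20, -0.15, 0.05],
              [0.10, 0.40, 0.35],
              [0.05, -0.30, -0.25]])

m1, e1 = covariate_effect_scores(r, f=1)   # trait-focused
m0, e0 = covariate_effect_scores(r, f=0)   # environment-focused

# Both partitions yield the same rank-2 fit of the table.
same_fit = np.allclose(m1 @ e1, m0 @ e0)
```

Only the relative stretching of the two sets of points changes with `f`; the inner products between trait and environment scores, and hence the fitted correlations, do not.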

Congruency between the GGE Pattern and the Covariate-Effect Pattern
Calculating the congruency coefficient between the GGE pattern in the GGE biplot and the covariate-effect pattern in the covariate-effect biplot consisted of two steps. The first step was to calculate the distance matrices among environments in the two biplots. The distance between two environments was calculated as

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$
$x_i$ and $x_j$ being the PC1 scores and $y_i$ and $y_j$ the PC2 scores of the two environments on the basis of environment-focused singular-value partitioning. The second step was to calculate the correlation between two distance matrices, which is a measure of the congruency between the two biplots. All analyses reported in this paper were conducted using the GGEbiplot software (Yan, 2001; Yan and Kang, 2003; www.ggebiplot.com; verified 6 January 2005).
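
The two-step congruency calculation might look like the sketch below. It assumes that the correlation is an ordinary Pearson correlation taken over each pair of environments once (the upper triangle of the distance matrices); the paper does not spell out this detail, and the function names and example scores are hypothetical.

```python
import numpy as np

def env_distances(scores):
    """Euclidean distances among environments from (environments x 2) PC scores."""
    diff = scores[:, None, :] - scores[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def congruency(scores_a, scores_b):
    """Correlation between the environment distance matrices of two biplots."""
    da = env_distances(np.asarray(scores_a, dtype=float))
    db = env_distances(np.asarray(scores_b, dtype=float))
    iu = np.triu_indices(da.shape[0], k=1)   # each pair of environments once
    return np.corrcoef(da[iu], db[iu])[0, 1]

# Made-up scores for five environments in biplot A.
a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [2.0, 2.0]])
# Biplot B: the same configuration rotated by 90 degrees and doubled in size.
rot = np.array([[0.0, -1.0], [1.0, 0.0]])
b = 2.0 * a @ rot.T

c = congruency(a, b)
```

Because distances are unchanged by rotation and a uniform rescaling does not affect a correlation, `c` is 1 here, which illustrates that the measure compares the relative arrangement of environments rather than the orientation of the axes.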


    RESULTS
Megaenvironment Differentiation Based on the GGE Biplot
Figure 1 illustrates the GGE biplot based on the 145-genotype x 25-environment two-way table of yield. Because environment-standardized data are used (Eq. [1]), all environments are assumed to be equally important in genotype evaluation. Environment-focused singular-value partitioning (Eq. [3]) allows appropriate visualization of the relationships among environments.



Fig. 1. GGE biplot based on yield data of 145 doubled haploid lines (presented as dots) in 25 environments. The environments are designated as a province/state code plus year code plus a letter to differentiate different locations in a given province. AB: Alberta; AK: Alaska; MB: Manitoba; MT: Montana; ND: North Dakota; ON: Ontario; PE: Prince Edward Island; QC: Quebec; SK: Saskatchewan; WA: Washington.

 
Among other things, the GGE biplot revealed two nonoverlapping clusters of environments. The cluster on the top included all environments from Alberta (AB), most environments from Saskatchewan (SK), and environments from North Dakota (ND), Washington state (WA), and Alaska (AK). Since this cluster of environments involved mostly locations from the western part of North America, it can be referred to as the western megaenvironment. The cluster on the right included all environments from Ontario (ON), Prince Edward Island (PE), and Manitoba (MB). This cluster can, therefore, be referred to as the eastern megaenvironment. Exceptions included two SK environments (SK93A and SK93B) and the Montana environment (MT93) that fell into the eastern megaenvironment and one Quebec environment (QC92) that fell into the western megaenvironment, reflecting the genotype x year x location interactions. The environment from Montana (MT93) fell in between the two clusters.

This biplot explained only 31% of the GGE, implying that the GE for yield in this dataset was complex. A biplot of PC3 vs. PC4 (not shown) did not reveal any discernible patterns, indicating that the GGE biplot of PC1 vs. PC2 adequately displayed the GGE patterns.
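
A figure such as the 31% above is conventionally obtained as the share of the total sum of squares carried by the first two singular values. The sketch below assumes that convention, which the text does not state explicitly; the function name and the random data are hypothetical.

```python
import numpy as np

def proportion_explained(table, k=2):
    """Share of the total sum of squares captured by the first k PCs."""
    s = np.linalg.svd(np.asarray(table, dtype=float), compute_uv=False)
    return (s[:k] ** 2).sum() / (s ** 2).sum()

# Made-up yields for 12 genotypes in 5 environments, environment-standardized.
rng = np.random.default_rng(2)
y = rng.normal(size=(12, 5))
z = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
share = proportion_explained(z)

# The same function applied with k=4 to PC1..PC4 shows how much PC3 and PC4 add.
share_pc3_pc4 = proportion_explained(z, k=4) - share
```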

Effect of Genetic Covariables on Yield in Different Environments
The covariate-effect biplot based on the trait x environment table of correlations including 21 traits is presented in Fig. 2a. It explained 81% of the total variation of the covariate-effect table and is, therefore, a good approximation of it. Because it is based on trait-focused singular-value partitioning, it is appropriate for visualizing the effects of the traits on yield as well as the similarities among traits in response to the environment. The rays connecting the traits to the biplot origin are referred to as trait vectors. The vector length of a trait measures the magnitude of its effect (positive or negative) on yield. Kernel weight, days to maturity, days to heading, and lodging score had relatively long vectors, suggesting that they had relatively large effects on yield in one or more environments. In contrast, most quality traits such as amylase activity, soluble protein content, soluble/total protein ratio, diastatic power, extraction difference, etc., had short vectors, suggesting that they had little association with yield in any environment. When traits not significantly (P < 0.05) associated with yield in any of the environments were removed before biplot analysis, Fig. 2a was reduced to Fig. 2b. A comparison between the two biplots reveals that it is the traits with short vectors that were removed. This provides empirical support for the statement that the vector length of a trait is a measure of its effect on yield. The numerical data on which Fig. 2b was based are presented in Table 2.





Fig. 2. Trait x environment biplots based on trait x environment two-way tables of correlation coefficients between traits and yield in each of the environments. (a) Biplot involving all 21 traits, based on trait-focused singular-value partitioning; (b) biplot involving 8 traits that were significantly correlated with yield (P < 0.05) in at least one environment, based on trait-focused singular-value partitioning; and (c) same biplot as (b) but based on environment-focused singular-value partitioning. Abbreviations for the environments are the same as in Fig. 1. Abbreviations for the traits are: AMY: amylase activity; BGL: Beta-glucan content; DS: Diastatic Power (measures enzyme activity for converting starch to sugar); HEADING: days to heading; HEIGHT: plant height; KW: kernel weight; LODGING: lodging score; MATURITY: days to maturity; MBG: Malt Beta Glucan (undesirable for beer); PKW: weight of plump kernels; PLM: plumpness; PM: powdery mildew susceptibility; PMGROUP: powdery mildew susceptibility group; PRO: protein content in grain; PSL: soluble protein content (related to enzyme content); PST: soluble/total protein ratio; TSWT: test weight; VSCO: Visual score in the field; X70: extraction at 70°C; XDF: extraction difference between X70 and XFI; and XFI: fine extraction.

 
The cosine of the angle between the vectors of two traits measures the similarity between them relative to their effects on yield. Thus, plumpness, test weight, and protein content had acute (<90°) angles with kernel weight, indicating that their effects on yield were similar to that of kernel weight. On the contrary, lodging score and days to heading had obtuse (>90°) angles with kernel weight, indicating that their effects on yield were opposite to that of kernel weight. Powdery mildew susceptibility had a short vector, indicating that its effects on yield were relatively minor and that it was not similar to any other traits relative to its effect on yield. Days to maturity had a near-right angle with all traits except lodging score, indicating that its effect on yield was more or less independent of that of other traits except lodging score; its effect on yield tended to be opposite to that of lodging score.
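
The vector length and angle readings described above amount to simple operations on the PC1 and PC2 trait scores. The sketch below uses invented scores for three traits purely to show the calculation; the coordinates are not read from Fig. 2 and the function names are hypothetical.

```python
import numpy as np

def vector_length(v):
    """Distance of a trait marker from the biplot origin."""
    return float(np.linalg.norm(np.asarray(v, dtype=float)))

def cosine_between(v1, v2):
    """Cosine of the angle between two trait vectors."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# Invented (PC1, PC2) trait scores, for illustration only.
kw = np.array([0.8, 0.3])        # kernel weight
plm = np.array([0.6, 0.5])       # plumpness
lodging = np.array([-0.7, 0.2])  # lodging score

similar = cosine_between(kw, plm) > 0        # acute angle: similar effects
opposite = cosine_between(kw, lodging) < 0   # obtuse angle: opposite effects
```

A cosine near zero corresponds to the near-right angle described for days to maturity, that is, effects that are more or less independent.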

Megaenvironment Identification Based on the Covariate-Effect Biplot
Figure 2c represents the same biplot as Fig. 2b but is based on environment-focused singular-value partitioning. It is, therefore, more appropriate for visualizing the relationship among environments relative to yield-trait relations. Interestingly, the 25 environments fell into two non-overlapping clusters, whose members happen to be the same as the two megaenvironments revealed in the GGE biplot (Fig. 1). The congruency between the GGE pattern (Fig. 1) and the covariate-effect pattern (Fig. 2c) is 0.904, indicating that the response of trait-yield relations to the environment explained 0.904² ≈ 81% of the observed GGE pattern. In other words, the observed GGE pattern can be effectively explained by the covariate-effect pattern, which implies that the observed GE in the GGE biplot can be effectively exploited by developing trait-selection strategies specific to each megaenvironment.

Strategies of Indirect Selection for Different Megaenvironments
Dividing the target environments into meaningful megaenvironments and deploying different cultivars for different megaenvironments is the only way that positive GE can be exploited, and negative GE avoided. Evidence of the eastern vs. western megaenvironment differentiation in the GGE biplot (Fig. 1) implies that different cultivars should be selected and deployed for the two megaenvironments. Evidence of the same megaenvironment differentiation in the covariate-effect biplot (Fig. 2c) suggests that different trait-selection strategies can be developed in breeding for higher yield for each megaenvironment.

Figure 3a represents the covariate-effect biplot involving only the eastern locations (Prince Edward Island, Ontario, Quebec, and Manitoba). All traits except powdery mildew susceptibility and days to maturity showed consistent effects on yield across environments. Specifically, kernel weight, test weight, plumpness, and protein content had consistently positive associations with yield, as evidenced by the acute angles between the vectors of these traits and the vectors of all environments except QC92. On the contrary, lodging score and days to heading (not days to maturity, though), showed consistently negative associations with yield, as evidenced by the obtuse angles between their vectors and the vectors of all environments except QC92. Therefore, selection for larger kernel weight, earlier heading, and better lodging-resistance should lead to increased yield in the eastern megaenvironment. The correlation coefficients between yield and these three traits are summarized in Table 3 for each environment and megaenvironment.




Fig. 3. Trait x environment biplots for each megaenvironment: (a) environments from the eastern locations, and (b) environments from the western locations. See Fig. 1 and 2 for environment and trait abbreviations.

 

Table 3. Correlation coefficients between yield and genetic values of three traits in the two megaenvironments.

 
Figure 3b represents the covariate-effect biplot for the western locations (Alberta, Saskatchewan, Washington, Montana, and Alaska). First, the environments took both positive and negative values for both axes, implying that the trait-yield relations varied dramatically among environments within this megaenvironment. Consequently, no single trait had positive effects on yield in all environments. Second, environments from the same locations (Alberta and Saskatchewan) were scattered across the biplot, indicating that the environments cannot be further divided into repeatable subgroups. Therefore, the western megaenvironment is a single megaenvironment with large unpredictable trait-yield relations. As a result, it does not seem feasible to improve yield by selecting for any of the traits; selection for yield per se in multiple environments seems to be the only way to improve yield in this megaenvironment.

Genetic Correlations among Traits
Despite the fact that the covariate-effect patterns differed dramatically in the two megaenvironments, the relationships among traits relative to their effects on yield were more or less similar. That is, in both megaenvironments (Fig. 3a vs. Fig. 3b), as well as across all environments (Fig. 2b), kernel weight, test weight, protein content, and plumpness constitute one group of traits with similar effects on yield (indicated by the acute angles among them); days to heading and lodging score constitute another group. The effects of the two groups were more or less opposite, however, as indicated by the obtuse angles between them. Genetic correlation among traits may be the underlying reason for this. Figure 4 represents the genotype x trait biplot based on the genotype x trait two-way table, which contains the genetic values of the traits for each genotype. This biplot, therefore, approximately displays the genetic correlations among the traits. The eight traits fell into three relatively independent groups: kernel weight, test weight, protein content, and plumpness constitute one group with positive associations among them. Days to heading, days to maturity, and powdery mildew susceptibility constitute another group of positively associated traits. The latter group can be explained by the QTL mapping results, which reveal a strong QTL for both days to heading and days to maturity in the middle of chromosome 4, and a major gene for powdery mildew resistance ("dMlg") in the same region (Tinker et al., 1996). Lodging score is negatively associated with both groups of traits.



Fig. 4. Genotype by trait biplot showing the genetic correlations among eight traits across genotypes. The 145 doubled haploid lines are represented by dots. See Fig. 2 for trait abbreviations.

 

    DISCUSSION
Barley Megaenvironments in North America
We have demonstrated that tandem use of three types of biplots, namely, the GGE biplot, the genotype x trait biplot, and the covariate-effect biplot, facilitates revealing, understanding, and exploring GE. Although the barley dataset was used only as an example for demonstrating the biplot analysis methodology, the finding that the barley testing environments in North America can be divided into two megaenvironments with distinct yield-trait relations has practical implications. It implies that different barley varieties should be selected and different selection strategies should be used for the eastern vs. western megaenvironments. This finding differs from that of Atlin et al. (2000), who analyzed the yield data from the same barley genomics study but concluded that there was no evidence for dividing barley growing areas in Canada into different megaenvironments. The apparent discrepancy between the current study and Atlin et al. (2000) arose from the difference in treating the Manitoba test site. In Atlin et al. (2000), the Manitoba test site was grouped into the western locations on the basis of the traditional breakdown of Canadian barley breeding programs, even though variety performance at this site was actually more similar to that at the eastern locations. Since the Manitoba site was similar to the eastern locations from Ontario, Quebec, and Prince Edward Island but distinct from the western locations from Alberta and Saskatchewan (Fig. 1), its grouping into the western locations obscures the distinct variety responses in the two megaenvironments. The two megaenvironments had a near-right angle in the GGE biplot (Fig. 1), suggesting more or less independent variety responses, which is supported by the results in Atlin et al. (2000) that phenotypic correlations between the two megaenvironments were no greater than 0.30 (their Table 4). This discussion illustrates the power of the GGE biplot in revealing patterns and in generating hypotheses on megaenvironments.


Table 4. Data structure for the analysis of covariable effects with environmental factors as interpretive variables and environment-centered or standardized yield for different genotypes as response variables.

 
Another related study is Romagosa et al. (1996), who studied barley megaenvironments in North America on the basis of QTL x environment interactions, using genetic and phenotypic data of the ‘Steptoe’ x ‘Morex’ mapping population. The western vs. eastern megaenvironment differentiation was not obvious in this dataset, although three Saskatchewan sites and one Manitoba site were clearly separated from other sites, mainly because of the differential effects of a single QTL located between markers abc156a and abg358 (biplots not shown). Since reliable megaenvironment classification requires a diverse set of genotypes to be tested in a representative set of environments, it is not surprising that different studies led to different results.

GGE Biplot vs. Other Genotype x Environment Biplots
Although we have recommended using a GGE biplot in studying genotype x environment tables, there are other types of biplots that have been used in studying such data (DeLacy et al., 1996). All biplots can be useful as long as they are interpreted correctly. Some comparisons of six types of GE-containing biplots are briefly given below.

  1. The COMM biplot, based on the completely multiplicative model (i.e., SVD of the original two-way table without centering). This biplot contains E, G, and GE, and the patterns are often obscured by the grand mean. Equation [4] actually represents the completely multiplicative model. It is useful for visualizing the real data but is effective only when the grand mean is close to 0, as in the case of the covariate-effect table.
  2. The SHMM biplot, based on the shifted multiplicative model (Cornelius et al., 1996). This biplot also displays a mixture of E, G and GE, but the patterns can be obscured by "the shifting factor."
  3. The PCA biplot, based on SVD of grand mean-centered genotype x environment data (Zobel et al., 1988). This biplot contains E, G, and GE, with E often predominating over G and GE.
  4. The GE biplot or AMMI2 biplot, based on SVD of double-centered genotype x environment data (Gauch, 1992). Since G and E are removed before SVD, it displays GE only.
  5. The AMMI1 biplot, plotting genotype and environment main effects against interactive PC1 scores (Gauch, 1992). Like the PCA biplot, the AMMI1 biplot displays E, G, and GE but usually explains more variation. It does not have the interpretation of a normal biplot, however.
  6. The GGE biplot, based on SVD of environment-centered or standardized genotype x environment table. As its name suggests, the GGE biplot displays G and GE (Yan et al., 2000).
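The four SVD-based models in this list that differ only in centering can be sketched as follows. This is a minimal illustration in Python with NumPy, not the GGEbiplot implementation; the function name `center_for_biplot` and the toy yield table are assumptions made for the example. The SHMM biplot (which requires fitting a shift parameter) and the AMMI1 biplot (which is not the SVD of a single table) are deliberately left out, as is the standardized variant of the GGE model.

```python
import numpy as np

def center_for_biplot(y, model):
    """Prepare a genotype x environment table for SVD under a given model."""
    y = np.asarray(y, dtype=float)
    grand = y.mean()                          # grand mean
    g_means = y.mean(axis=1, keepdims=True)   # genotype (row) means
    e_means = y.mean(axis=0, keepdims=True)   # environment (column) means
    if model == "COMM":   # no centering: grand mean + E + G + GE
        return y.copy()
    if model == "PCA":    # grand mean removed: E + G + GE
        return y - grand
    if model == "GE":     # double-centered: GE only
        return y - g_means - e_means + grand
    if model == "GGE":    # environment-centered: G + GE
        return y - e_means
    raise ValueError(f"unknown model: {model}")

# Hypothetical yields: 3 genotypes (rows) x 3 environments (columns)
y = np.array([[4.0, 6.0, 5.0],
              [3.0, 7.0, 8.0],
              [5.0, 5.0, 2.0]])
gge = center_for_biplot(y, "GGE")
ge = center_for_biplot(y, "GE")
# The difference between the two tables is the genotype main effect G
g_effect = gge - ge
```

In this sketch the GGE and GE tables differ by exactly the genotype main effects, which is why the GE biplot cannot rank cultivars by mean performance.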

Since cultivar evaluation and megaenvironment classification must be based on both G and GE (Gauch and Zobel, 1996), all biplots that display both G and GE can be useful for this purpose. All biplots listed above, except the GE biplot, contain G and some GE. The GE biplot is most powerful for studying GE per se, but cannot be used in selecting superior cultivars because it excludes G. Since a GGE biplot displays the maximum G+GE of all biplots and has many convenient interpretations, it is considered to be the most appropriate biplot for cultivar evaluation (Yan et al., 2000; Crossa et al., 2002). One attractive feature of the GGE biplot in cultivar evaluation is that it graphically shows the ‘which-won-where’ pattern of a genotype x environment two-way dataset (Yan et al., 2000). All other biplots can at best approximate the GGE biplot in this regard. The AMMI1 biplot, although effective in summarizing the genotype x environment data, cannot show ‘which-won-where’ because it does not have the inner-product property of a normal biplot. Some graphs (not biplots) based on AMMI analysis do address the ‘which-won-where’ issue (Gauch and Zobel, 1997). For a given dataset, all types of biplots listed above except the SHMM biplot can be readily generated and visualized using the GGEbiplot software.

Four Questions Must Be Asked before Attempting to Interpret a Biplot
Four questions must be asked before trying to interpret a biplot. First, what is the model on which the biplot is based? This determines the type of questions that can be potentially addressed by the biplot. For example, a GGE biplot can be used in cultivar evaluation and recommendation, whereas a GE biplot cannot. Second, what is the singular-value partitioning method used in generating the biplot? This determines whether the biplot is appropriate for visualizing the relationships among genotypes or those among environments (or traits). Third, how much variation is explained by the biplot? This determines the credibility of the biplot; interpretations about entries and testers with short vectors may not be accurate if the biplot explained only a small fraction of the total variation. Last but not least, are the biplot axes drawn to scale? If not, the relations displayed by the biplot are distorted and the interpretations can be misleading. Biplot analysis reported in this paper was conducted using the GGEbiplot software, which explicitly addresses these questions and ensures that correct biplots are generated for particular purposes. GGEbiplot is user-friendly, feature-rich software for biplot analysis; it was developed for researchers with limited training in statistics and computer application.
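The second and third questions can be made concrete with a short sketch in Python with NumPy. It is an illustration under stated assumptions rather than the GGEbiplot implementation: the function name `biplot_scores`, the partitioning factor `f`, and the toy data are hypothetical. Here `f = 1` assigns the singular values entirely to the entry (row) scores and `f = 0` entirely to the tester (column) scores; the inner product of the two sets of scores is the same in either case.

```python
import numpy as np

def biplot_scores(table, f=0.5, n_axes=2):
    """Entry scores, tester scores, and share of variation shown by a biplot.

    table : already-centered two-way table (entries x testers)
    f     : singular-value partitioning factor (hypothetical name)
    """
    u, s, vt = np.linalg.svd(np.asarray(table, dtype=float), full_matrices=False)
    rows = u[:, :n_axes] * s[:n_axes] ** f            # entry (e.g., genotype) scores
    cols = vt[:n_axes].T * s[:n_axes] ** (1.0 - f)    # tester (e.g., environment) scores
    explained = (s[:n_axes] ** 2).sum() / (s ** 2).sum()
    return rows, cols, explained

# Hypothetical yields: 4 genotypes x 3 environments, environment-centered (GGE model)
y = np.array([[5.0, 6.0, 7.0],
              [4.0, 4.5, 6.5],
              [3.0, 5.0, 4.0],
              [2.0, 3.5, 3.0]])
gge = y - y.mean(axis=0, keepdims=True)

geno_1, env_1, share = biplot_scores(gge, f=1.0)   # singular values on genotypes
geno_0, env_0, _ = biplot_scores(gge, f=0.0)       # singular values on environments
print(f"First two axes explain {share:.1%} of G+GE")
```

On the fourth question, when such scores are plotted with a general-purpose plotting library, the two axes need the same physical scale; in Matplotlib, for example, this can be requested with `ax.set_aspect("equal")`.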

GGE Patterns vs. GE Patterns
We have referred to the patterns observable from a GGE biplot as GGE patterns. With careful interpretations, however, meaningful statements about G and GE can be made from such patterns while maintaining the advantages of simultaneous display. The patterns that can be visualized in a GGE biplot include patterns regarding the genotypes, patterns regarding the environments, and patterns regarding both genotypes and environments (e.g., the which-won-where pattern). The genotype patterns are attributable to a mixture of G and GE. This is why a GGE biplot allows visualizing both mean performance and stability of the genotypes. Although G and GE are confounded in the GGE biplot, it is possible to distinguish patterns due to G from those due to GE. The best way is to use the average-environment coordination (AEC) view of the GGE biplot (Yan, 2001): the AEC-abscissa represents variation due to G and the AEC-ordinate represents variation due to GE. If there is no GE, all genotypes would fall on the AEC-abscissa, which would be parallel to the PC1 axis, as PC1 would explain 100% of the total variation of the environment-centered or standardized genotype x environment data. Any deviation from this pattern is due to GE. On the other hand, if there is little G in the data, both genotypes and environments would be scattered in the biplot in all directions. Typically, PC1 scores are highly correlated with G if G is >20% of G+GE (Yan et al., 2001). Thus, if G is sizable, PC1 will be dominated by G and all other PCs will be dominated by GE. If G is trivial, all PCs should be dominated by GE. In either case, anything not explained in the GGE biplot is mostly GE. If the GGE biplot explains a relatively small proportion of the total GGE, the GE in the data must be complex.
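The AEC view can be sketched numerically as a projection of the genotype scores onto the axis through the biplot origin and the average environment. The Python/NumPy code below is a minimal illustration, not the GGEbiplot implementation; the function name `aec_coordinates`, the genotype-focused scaling, and the toy data are assumptions made for the example.

```python
import numpy as np

def aec_coordinates(geno_scores, env_scores):
    """Project genotype scores onto the AEC abscissa and ordinate."""
    avg_env = env_scores.mean(axis=0)                  # the "average environment"
    abscissa = avg_env / np.linalg.norm(avg_env)       # unit vector: origin -> average environment
    ordinate = np.array([-abscissa[1], abscissa[0]])   # perpendicular unit vector
    return geno_scores @ abscissa, geno_scores @ ordinate

# Hypothetical yields: 4 genotypes x 3 environments, environment-centered
y = np.array([[5.0, 6.0, 7.0],
              [4.0, 4.5, 6.5],
              [3.0, 5.0, 4.0],
              [2.0, 3.5, 3.0]])
gge = y - y.mean(axis=0, keepdims=True)

u, s, vt = np.linalg.svd(gge, full_matrices=False)
geno = u[:, :2] * s[:2]    # genotype-focused scaling (singular values on genotypes)
env = vt[:2].T             # environment scores

mean_coord, stability_coord = aec_coordinates(geno, env)
# mean_coord approximates mean performance (G); stability_coord reflects GE
```

Because of the inner-product property, the abscissa coordinate multiplied by the length of the average-environment vector equals each genotype's mean across environments in the two-dimensional approximation of the table.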

It is relevant here to emphasize that the distinction between G and GE may be meaningful only within the scope of environments in which they are estimated. The so-called G estimated across a small set of environments may well be GE if put into a larger scale of environments. Yan and Hunt (2001) demonstrated that G estimated from individual years was actually GE when compared across years. The GGE biplot interprets G as proportionate responses of genotypes to the environment (Yan et al., 2000). Although the patterns of genotypes can be separated into patterns attributable to G and GE, it is neither necessary nor beneficial from the viewpoint of cultivar evaluation.

In contrast to the genotypic patterns, the patterns regarding the environments in a GGE biplot are solely attributable to GE because E has been removed. If there is no GE in the GGE biplot, all environments would fall on a single point on the PC1 axis. Therefore, it is legitimate to discuss GE patterns based on a GGE biplot. With this understanding, the differentiation among environments in the GGE biplot should be consistent with that based on a GE biplot. To verify this, note that the two-megaenvironment classification based on the GGE biplot (Fig. 1) is also obvious on the GE biplot (Fig. 5). In the latter biplot, the western vs. eastern megaenvironment differentiation is clearly reflected on the interactive PC1. Although the interactive PC2 explains a sizable GE, no meaningful pattern can be found (Fig. 5). In general, the GE biplot is more powerful in environment classification than the GGE biplot because it displays more GE, although the GGE biplot is the single most informative biplot for both genotype and environment evaluation.
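The no-GE case can be checked with a purely additive toy table. The sketch below (Python with NumPy; all effect values are hypothetical) builds yields as grand mean plus genotype effect plus environment effect, with no interaction term, and confirms that PC1 then carries all variation, that the environments coincide at one point on PC1, and that the double-centered GE table is empty.

```python
import numpy as np

g = np.array([2.0, 0.5, -1.0, -1.5])   # genotype main effects (hypothetical)
e = np.array([10.0, 12.0, 9.0])        # environment main effects (hypothetical)
y = 50.0 + g[:, None] + e[None, :]     # additive table: no GE term

gge = y - y.mean(axis=0, keepdims=True)            # environment-centered (G + GE)
u, s, vt = np.linalg.svd(gge, full_matrices=False)

pc1_share = s[0] ** 2 / (s ** 2).sum()             # proportion explained by PC1
env_scores = vt[:2].T * s[:2]                      # environment-focused scaling
ge = gge - gge.mean(axis=1, keepdims=True)         # double-centered (GE only)

print(f"PC1 share: {pc1_share:.3f}")
print("Environment scores (PC1, PC2):")
print(np.round(env_scores, 6))
```

Adding any interaction term to `y` would separate the environment points, which is why differences among environments in a GGE biplot can be read as GE.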



Fig. 5. GE biplot that displays only the GE of the genotype x environment table of yield. The 145 doubled haploid lines are represented by dots. See Fig. 1 for environment abbreviations.

 
Megaenvironment Classification vs. the Which-Won-Where Pattern
Dividing the target environments into meaningful megaenvironments and deploying different cultivars for different megaenvironments is the only way to utilize positive GE and avoid negative GE. A megaenvironment is defined as a group of locations that consistently share the same best cultivar(s) (Yan and Rajcan, 2002). This definition involves several essential elements. First, megaenvironments are defined by different winning cultivars. Second, subdivision of environments into groups on the basis of geographical positions represents a necessary condition for breeding of specific adaptations. Third, the cultivar-location interaction pattern should be repeatable across years. With this definition, the ‘which-won-where’ pattern is but one of the factors that must be considered in megaenvironment classification.

Figure 6 represents the which-won-where view of the same GGE biplot in Fig. 1. If which-won-where is used as the sole criterion, there would be three megaenvironments, defined by the winning genotypes 175 (five environments from SK and AB), 826 (three environments from ON), and 829 (17 environments from all locations), respectively. Obviously, this classification is meaningless because, on one hand, niches of line 175 and line 826 were actually part of the line 829 niche and, on the other hand, the line 829 niche covered all locations that were apparently different.



Fig. 6. The ‘which-won-where’ view of the same GGE biplot as Fig. 1. The doubled haploid lines other than those on the vertices are represented by dots. See Fig. 1 for environment abbreviations.

 
Conceptually, if the which-won-where pattern is used as the sole criterion, the identified megaenvironments would vary substantially when genotypes are changed or deleted from the data (Gauch and Zobel, 1997). Consequently, the development of a "universal" winner will result in merging several megaenvironments into a single one (Gauch and Zobel, 1997), whereas a single megaenvironment will be split into multiple ones when more specifically adapted cultivars are developed. As a result, one may know how many megaenvironments there were in the target environment in the previous year but will never be sure how many there will be in the next year. Such a megaenvironment study will have limited use in guiding cultivar development and recommendation. For megaenvironment classification to be useful, it is important to have more or less stable megaenvironments.
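The sensitivity of a winner-only classification to the set of genotypes can be illustrated with a small sketch in Python with NumPy. It is a simplification: winners are taken directly from a toy table of yields rather than from the polygon view of a biplot, and the function name `winners_by_environment`, the genotype labels, and the data are hypothetical.

```python
import numpy as np

def winners_by_environment(y, genotypes, environments):
    """Group environments by the genotype that ranked first in each."""
    y = np.asarray(y, dtype=float)
    best = y.argmax(axis=0)            # row index of the winner in each environment
    niches = {}
    for env, idx in zip(environments, best):
        niches.setdefault(genotypes[idx], []).append(env)
    return niches

genotypes = ["A", "B", "C"]
environments = ["E1", "E2", "E3", "E4"]
y = np.array([[5.0, 6.0, 3.0, 2.0],    # A: best of A and B in E1 and E2
              [4.0, 5.0, 6.0, 7.0],    # B: best of A and B in E3 and E4
              [6.0, 7.0, 6.5, 7.5]])   # C: a "universal" winner

with_c = winners_by_environment(y, genotypes, environments)
without_c = winners_by_environment(y[:2], genotypes[:2], environments)
print(with_c)
print(without_c)
```

With genotype C in the table, all four environments form one group; without it, the same environments split into two groups, although the environments themselves have not changed.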

The western vs. eastern megaenvironment classification is meaningful because it meets the following criteria. First, the apparent differentiation of the two environment clusters on the biplot (Fig. 1 or Fig. 6) meets the common principle of classification that variation between clusters is maximized and that within clusters is minimized. Second, the classification is consistent with the geographical locations of the test environments. Third, the location grouping was relatively consistent across years. And, finally, the two megaenvironments did have different winning genotypes: lines 175 and 829 were the winning genotypes for the western megaenvironment, whereas lines 829 and 826 were the winning genotypes for the eastern megaenvironment. This example illustrates that a megaenvironment may have more than one winning genotype and that even if there exists a universal winner, it is still possible, and beneficial, to divide the target environments into meaningful megaenvironments.

The Covariate-Effect Biplot as a Graphical Tool for Interpreting GE
A covariate-effect biplot was proposed in this paper so that the GGE patterns for yield can be interpreted using explanatory traits. This biplot allows visualizing the effects of each explanatory trait on yield in each of the environments. Eight traits (kernel weight, heading, lodging, maturity, powdery mildew susceptibility, protein content, test weight, and plumpness) had associations with yield in at least one environment, and their relations with yield in different environments explained 81% of the GGE pattern, implying that the GGE pattern can be effectively exploited by developing trait-selection strategies specific to each megaenvironment.

The explanatory traits were used as genetic covariables in our analysis. That is, the correlation coefficients are obtained on the basis of the genetic values of the explanatory traits. One may question whether this is legitimate when the explanatory traits may express GE themselves. The justification is that GE associated with each trait is largely removed when traits are averaged across environments. Traits that strongly interact with the environment would have little variation after being averaged across environments; as a result, their associations with the target trait in individual environments would be trivial. Such traits would not survive the relevance screening, or if they do, they would fall near the origin of the covariate-effect biplot. Thus, all explanatory traits that survived the relevance screening should have relatively small GE relative to G.
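The construction of a covariate-effect table from genetic values can be sketched as follows, in Python with NumPy. This is a minimal illustration under stated assumptions, not the procedure as implemented in GGEbiplot: the function name `covariate_effect_table`, the two trait names, and all numbers are hypothetical, and the relevance screening step is omitted.

```python
import numpy as np

def covariate_effect_table(yield_ge, trait_means):
    """Trait x environment table of Pearson correlations.

    yield_ge    : genotypes x environments table of the target trait
    trait_means : genotypes x traits table of explanatory traits,
                  each averaged across environments (genetic values)
    """
    y = np.asarray(yield_ge, dtype=float)
    t = np.asarray(trait_means, dtype=float)
    yz = (y - y.mean(axis=0)) / y.std(axis=0)   # standardize each environment
    tz = (t - t.mean(axis=0)) / t.std(axis=0)   # standardize each trait
    return tz.T @ yz / y.shape[0]               # correlations, traits x environments

# Hypothetical data: 5 genotypes, 2 environments, 2 explanatory traits
yields = np.array([[4.0, 5.0],
                   [5.0, 4.0],
                   [6.0, 6.5],
                   [7.0, 5.5],
                   [8.0, 7.0]])
traits = np.array([[38.0, 60.0],    # columns: kernel weight, days to heading
                   [40.0, 58.0],
                   [42.0, 59.0],
                   [44.0, 55.0],
                   [46.0, 54.0]])

table = covariate_effect_table(yields, traits)
print(np.round(table, 2))
```

Because correlation coefficients lie between -1 and 1 and are typically centered near 0, a table of this kind can be subjected to SVD without further centering.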

A disadvantage of using genetic values of the explanatory traits is that traits with large GE may be regarded as irrelevant even if they are important. This is likely to occur for traits whose interactions with the environment are parallel with the GE of the target trait. To amend this, a covariate-effect table could be generated from phenotypic values of the explanatory traits in each environment. A potential problem of this approach is that the table may contain many missing cells because it is common that not all traits are measured in all environments. Nevertheless, whenever possible, it is advisable to examine both types of covariate-effect tables.

The covariate-effect biplot approach should be applicable when the explanatory traits are replaced by other genetic covariables such as molecular markers and gene sequences. When explanatory traits are replaced by genetic markers, the covariate-effect biplot can be used to identify QTL for the target trait (Yan et al., 2005), and to investigate the QTL x environment interactions (Yan et al., 2005; Yan and Tinker, 2005). The linear correlation coefficients in the covariate-effect table can also be replaced by linear regression coefficients, with response variables used as dependent variables and explanatory variables as independent variables.

The covariate-effect biplot based on a trait x environment table of covariate effects should not be confused with the trait x environment biplot based on a trait x environment table of trait values (Lee et al., 2003). The latter biplot can be used to visualize the environmental correlations among traits.

With minor modifications, the covariate-effect biplot described here can also be used to interpret GE using environmental factors. In this case, the environmental factors are regarded as explanatory variables, while the environment-centered or standardized data of the genotypes for a target trait serve as response variables (Table 4). The centering or standardization is necessary to remove the environment main effects. On the basis of Table 4, an environmental factor x genotype two-way table of covariate effects can be constructed, which can then be visualized by means of a biplot (not shown).
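A sketch of this variant, in Python with NumPy, is given below. It assumes the data are laid out as described for Table 4, with environments as observations; the function name `env_factor_effect_table`, the single rainfall factor, and all numbers are hypothetical. Each genotype's environment-centered yield must vary across environments for its correlations to be defined.

```python
import numpy as np

def env_factor_effect_table(yield_ge, env_factors):
    """Environmental factor x genotype table of Pearson correlations.

    yield_ge    : genotypes x environments table of the target trait
    env_factors : environments x factors table of environmental covariables
    """
    y = np.asarray(yield_ge, dtype=float)
    f = np.asarray(env_factors, dtype=float)
    centered = y - y.mean(axis=0, keepdims=True)   # remove environment main effects
    resp = centered.T                              # environments x genotypes
    rz = (resp - resp.mean(axis=0)) / resp.std(axis=0)
    fz = (f - f.mean(axis=0)) / f.std(axis=0)
    return fz.T @ rz / f.shape[0]                  # correlations, factors x genotypes

# Hypothetical data: 3 genotypes x 4 environments, one factor (seasonal rainfall, mm)
yields = np.array([[3.0, 4.0, 5.0, 6.5],    # responds positively to rainfall
                   [4.0, 4.2, 4.1, 4.3],    # nearly unresponsive
                   [5.0, 4.5, 3.5, 3.0]])   # responds negatively to rainfall
rainfall = np.array([[300.0], [400.0], [500.0], [600.0]])

effects = env_factor_effect_table(yields, rainfall)
print(np.round(effects, 2))
```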


    ACKNOWLEDGMENTS
 
We thank the Quaker Food Beverages Company and Quaker Tropicana Gatorade Canada for financial support of this research. We thank three anonymous reviewers for their insightful comments and useful suggestions.


    NOTES
1 The software GGEbiplot used in this paper is commercial software developed by Weikai Yan.

Received for publication February 6, 2004.


    REFERENCES

