a Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Lisboa 27, Apdo. Postal 6-641, 06600 Mexico D.F., Mexico
b Dep. of Agronomy and Dep. of Statistics, Univ. of Kentucky, Lexington, KY 40546-0091
c Dep. of Plant Agriculture, Univ. of Guelph, Guelph, ON, Canada N1G2W1
* Corresponding author (j.crossa{at}cgiar.org)
| ABSTRACT |
|---|
Abbreviations: SHMM, shifted multiplicative model; SREG, sites regression model; GEI, genotype × environment interaction; COI, crossover interaction; LS, least squares; SVD, singular value decomposition.
| INTRODUCTION |
|---|
The shifted multiplicative model (SHMM) is

$$\bar{y}_{ij.} = \beta + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij.},$$

and the sites regression model (SREG) is

$$\bar{y}_{ij.} = \mu_j + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij.},$$

where $\bar{y}_{ij.}$ is the mean of the ith cultivar in the jth environment for g cultivars and e sites (i = 1,2,...,g and j = 1,2,...,e); $\beta$ is the shift parameter; $\mu_j$ is the site mean; $\lambda_k$ ($\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_t$) are scaling constants (singular values) that allow the imposition of orthonormality constraints on the singular vectors for cultivars, $\alpha_k = (\alpha_{1k},\dots,\alpha_{gk})$, and sites, $\gamma_k = (\gamma_{1k},\dots,\gamma_{ek})$, such that $\sum_i \alpha_{ik}^2 = \sum_j \gamma_{jk}^2 = 1$ and $\sum_i \alpha_{ik}\alpha_{ik'} = \sum_j \gamma_{jk}\gamma_{jk'} = 0$ for $k \ne k'$; $\alpha_{ik}$ and $\gamma_{jk}$, for k = 1,2,3,..., are called primary, secondary, tertiary,..., effects of the ith cultivar and the jth site, respectively; $\bar{\varepsilon}_{ij.}$ is the residual error, assumed to be NID(0, $\sigma^2/r$) (where $\sigma^2$ is the pooled error variance and r is the number of replicates). The number of bilinear terms is $t \le \min(g, e)$. Estimates of the multiplicative parameters in the kth bilinear term are obtained as the kth component of the singular value decomposition (SVD) of the deviations from the additive part of the model. In the SHMM model, the bilinear terms absorb the environmental and genotypic main effects and the GEI, whereas in the SREG model, only the main effects of cultivars plus the GEI are absorbed into the bilinear terms.
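As an illustration of the estimation step just described, the following minimal sketch fits an SREG model by taking the SVD of the deviations of the cell means from the site means. The function name `fit_sreg` and the small data table are hypothetical and serve only as an example; the SHMM fit is not shown because its shift parameter must be estimated jointly with the bilinear terms.

```python
import numpy as np

def fit_sreg(Y, t=2):
    """Fit an SREG model with t bilinear terms to a g x e table of cell means."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean(axis=0)                  # site means (additive part of SREG)
    Z = Y - mu                           # deviations from the additive part
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    lam = s[:t]                          # singular values, in decreasing order
    alpha = U[:, :t]                     # cultivar scores (orthonormal columns)
    gamma = Vt[:t].T                     # site scores (orthonormal columns)
    fitted = mu + (alpha * lam) @ gamma.T
    return mu, lam, alpha, gamma, fitted

# Hypothetical 4-cultivar x 3-site table of cell means (illustrative only).
Y = np.array([[5.1, 4.0, 6.2],
              [4.8, 4.6, 5.0],
              [6.0, 3.9, 6.9],
              [5.5, 4.4, 5.7]])
mu, lam, alpha, gamma, fitted = fit_sreg(Y, t=2)
rss = float(((Y - fitted) ** 2).sum())   # residual sum of squares after two terms
```

Because the singular vectors returned by the SVD are already orthonormal, the constraints on the cultivar and site scores hold without any further normalization.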
If SHMM and SREG models with one multiplicative component (SHMM1 and SREG1) are adequate for fitting the data (second, third, and higher order multiplicative components are negligible) and primary effects of the sites, $\gamma_{j1}$, are either all non-positive or all non-negative, SHMM1 and SREG1 predict non-COI. On the contrary, if the $\gamma_{j1}$ are of different signs, SHMM1 and SREG1 models predict COI. Moreover, the non-COI property of SHMM1 and SREG1 (when the $\gamma_{j1}$ are either all non-positive or all non-negative) is a consequence of a proportionality condition, i.e., cultivar differences in any one site are proportional to their differences in any other site.
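The sign condition can be checked directly on the fitted site primary effects. The sketch below uses hypothetical one-term SREG parameter values (not taken from the trial analyzed here) to show that same-sign site effects give identical cultivar rankings in every site, whereas mixed signs reverse the ranking in some sites. The helper names `predicts_coi` and `site_rankings` are illustrative assumptions.

```python
import numpy as np

def predicts_coi(gamma1, tol=1e-12):
    """True if the site primary effects have mixed signs (one-term model predicts COI)."""
    g = np.asarray(gamma1, dtype=float)
    return bool((g > tol).any() and (g < -tol).any())

# Hypothetical one-term SREG fit: 3 cultivars, 3 sites.
mu = np.array([5.0, 4.0, 6.0])                   # site means
lam = 2.0                                        # singular value
alpha1 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2) # cultivar primary effects
same_sign = np.array([2.0, 1.0, 2.0]) / 3.0      # site primary effects, all positive
mixed_sign = np.array([2.0, -1.0, 2.0]) / 3.0    # site primary effects of both signs

def site_rankings(gamma1):
    """Within-site ranks of the predicted cell means (1 = best cultivar)."""
    pred = mu + lam * np.outer(alpha1, gamma1)
    return (-pred).argsort(axis=0).argsort(axis=0) + 1

ranks_same = site_rankings(same_sign)
ranks_mixed = site_rankings(mixed_sign)
```

In `ranks_same` every column is identical, which is the proportionality property at work; in `ranks_mixed` the column for the site with a negative primary effect is reversed.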
In various clustering studies based on SHMM or SREG (Cornelius et al., 1992, 1993; Crossa and Cornelius, 1997), the measure of distance (i.e., dissimilarity) between a pair of sites was the residual sum of squares (RSS) after fitting SHMM1 or SREG1, RSS(SHMM1) or RSS(SREG1), respectively. Seyedsadr and Cornelius (1993) proved that if $e \le g$, RSS(SHMM$_{e-1}$) = RSS(SREG$_{e-1}$). Thus, for a subset of data containing only two sites, RSS(SHMM1) = RSS(SREG1). If the resulting $\gamma_{j1}$ have the same sign, RSS(SHMM1) is a non-COI solution; but if the $\gamma_{j1}$ are of different signs, constrained SHMM1 and SREG1 solutions need to be computed. Crossa et al. (1993), in clustering sites into groups with non-COI, used constrained least squares (LS) SHMM1 solutions for pairs of sites needing constrained solutions, but Cornelius et al. (1993), in clustering cultivars into groups with non-COI, used constrained singular value decompositions (SVD) to obtain SHMM1 solutions. The constrained SVD solution will force only the most extreme primary effect of a site (located at left or right of the graph) to be zero, whereas the constrained LS solution will assign a value of zero to primary effects of as many sites as necessary to assure that site primary effects are either all non-negative or all non-positive.
Biplots are useful for summarizing and approximating patterns of response that exist in the original data (Gabriel, 1971, 1978). Yan et al. (2000) presented standard biplots of the SREG model that helped enhance its interpretation for selecting the best performing cultivars in subsets of sites. The authors proposed (i) connecting the markers of the farthest (most responsive) cultivars in the biplot such that they are the corners (vertices) of an irregular polygon and (ii) for each side of the polygon, drawing a line segment perpendicular to that side and passing through the origin. These line segments subdivide the polygon into sectors involving different subsets of sites and cultivars. The cultivar that is at the polygon corner located in one sector is the best performer (due to large positive GEI) in sites with markers included in that sector, but it is the worst performer (due to large negative GEI) in sites with markers located in the opposite sector of the biplot. The biplot from the SREG model shows that ideal cultivars should have large primary effects (high mean yield) and near-zero secondary effects (more stable) and the ideal sites should have large primary effects (high power to discriminate cultivars) and small secondary effects. Such properties tend to occur if the primary effects of cultivars are highly correlated with the cultivar means (Yan et al., 2001).
For SHMM2 and SREG2 models, the biplot of the first two multiplicative components would represent the graph of the interaction variation due to non-COI (first multiplicative term) (or proportionality of cultivar response in sites) versus the interaction variation due to COI (second multiplicative term) (or disproportionality of cultivar response in sites). This is accomplished if, and only if, the scores of the first singular vector for sites, $\gamma_{j1}$, are all of the same sign. If the $\gamma_{j1}$ are of different signs, a constrained solution for SHMM2 and SREG2 is required, such that the first multiplicative term shows a non-COI pattern. For SHMM2, this is simply obtained by constraining the first multiplicative term by the standard constrained SVD solution and using the second multiplicative component of its SVD as the second multiplicative term. For the SREG model, the SVD non-COI solution is not that simple.
Previous research using the SHMM and SREG models led to the development of clustering procedures for finding subsets of sites with non-COI or subsets of cultivars with non-COI (Cornelius et al., 1992, 1993; Crossa et al., 1993, 1995, 1996; Crossa and Cornelius, 1993, 1997). However, these procedures do not simultaneously identify non-COI subsets of cultivars and sites. The main purposes of this study were to investigate (i) SREG2 and SHMM2 biplots with the first multiplicative components constrained to be non-COI SREG1 and SHMM1 solutions and to compare these with the biplot of Yan et al. (2000) in cases where the unconstrained solution does not yield a non-COI solution, (ii) how the biplots can be used for identifying subsets of sites and cultivars with different levels of COI and with non-COI, and (iii) how these biplots compare with results obtained when clustering only sites or cultivars without cultivar rank change.
| MATERIALS AND METHODS |
|---|
Analyses were carried out on the basis of the unscaled data ($\bar{y}_{ij.}$) and on the basis of the scaled data ($\bar{y}^{*}_{ij.}$). The scaled data were computed as in Crossa and Cornelius (1997), using $s_j^2$, the error variance in the jth site, and r, the number of replicates. The notation used for the SREG2 and SHMM2 analyses and the corresponding biplots using unscaled and scaled data and unconstrained and constrained solutions is as follows: the first capital letter in parentheses denotes the type of data used (U = unscaled, S = scaled) and the second entry represents the type of solution applied (U = unconstrained; CSVD = constrained SVD; CLS+1 = constrained LS SREG1 as the first term, with the second term taken as the first component of the SVD of residuals from the SREG1 constrained LS solution). Thus, SREG2(U/U) denotes the sites regression model on unscaled data with an unconstrained solution; SREG2(U/CSVD) denotes the sites regression model on unscaled data with a constrained SVD solution; SREG2(S/U) denotes the sites regression model on scaled data with an unconstrained solution; SREG2(S/CSVD) denotes the sites regression model on scaled data with a constrained SVD solution; and SREG2(S/CLS+1) denotes the sites regression model on scaled data with a constrained LS solution. Similar notation is used for biplots of the SHMM2 model, where only constrained SVD solutions are computed.
Constrained SVD Non-COI Solution for the SHMM Model
The constrained SVD non-COI solution of the SHMM model for two or more cultivars is described in Cornelius et al. (1993), and that for two or more sites is given in Crossa et al. (1995, 1996). Briefly, the matrix subjected to SVD is $Z = [z_{ij}] = [\bar{y}_{ij.} - \beta]$. The residual sum of squares for SHMM1 is RSS(SHMM1) = trace(Z'Z) - L1, where L1 is the largest eigenvalue of Z'Z. In this case, the value of $\beta$ is selected such that the smallest or the largest $\gamma_{j1}$ is zero. For a pair of sites, 1 and 2, and g cultivars, the closed-form SVD non-COI SHMM1 solutions for $\beta$ are given by
![]() |
![]() |
Both solutions force Z'Z to be diagonal. One solution has $\gamma_{11} = 0$ and $\gamma_{21} = 1$ and the other $\gamma_{11} = 1$ and $\gamma_{21} = 0$. For either value of $\beta$, RSS(SHMM1) = trace(Z'Z) - L1 reduces to the sum of squared deviations of the cultivar means in the site with $\gamma_{j1} = 0$, so that the solution is the one that gives the minimum RSS(SHMM1) = $\min[\sum_i (\bar{y}_{i1.} - \beta)^2, \sum_i (\bar{y}_{i2.} - \beta)^2]$.
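The closed-form expressions themselves are not reproduced in this text. As a hedged illustration only, the statement that both solutions make Z'Z diagonal implies that the off-diagonal element, the sum over cultivars of the products of the two shifted site columns, is zero, which is a quadratic equation in the shift parameter. The sketch below solves that quadratic and evaluates RSS(SHMM1) for each root; it is derived from the diagonality condition and is not claimed to be the authors' printed formula. The function names and data are hypothetical.

```python
import numpy as np

def two_site_shifts(y1, y2):
    """Real roots beta of sum_i (y1_i - beta) * (y2_i - beta) = 0, if any exist."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    g = y1.size
    # Expanded form: g*beta^2 - (sum y1 + sum y2)*beta + sum(y1*y2) = 0
    roots = np.roots([g, -(y1.sum() + y2.sum()), float(y1 @ y2)])
    return [float(r.real) for r in roots if abs(r.imag) < 1e-12]

def rss_shmm1(y1, y2, beta):
    """RSS(SHMM1) = trace(Z'Z) - L1 for the two-site matrix Z = Y - beta."""
    Z = np.column_stack([y1, y2]).astype(float) - beta
    eig = np.linalg.eigvalsh(Z.T @ Z)    # eigenvalues in ascending order
    return float(np.trace(Z.T @ Z) - eig[-1])

# Hypothetical cell means of four cultivars in two sites.
y1 = [1.0, 2.0, 3.0, 4.0]
y2 = [4.0, 3.0, 2.0, 2.0]
betas = two_site_shifts(y1, y2)
best = min(betas, key=lambda b: rss_shmm1(y1, y2, b))   # shift with the smaller RSS
```

When the two sites rank the cultivars in nearly the same order, the quadratic may have no real root; in that situation the unconstrained solution already has site effects of one sign and no constrained solution is needed.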
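For more than two sites, the SVD non-COI SHMM1 solution does not exist in closed form (Cornelius et al., 1993). If site m is selected to have $\gamma_{m1} = 0$, the constraint is $\lambda_1 \gamma_{m1} = \sum_i \alpha_{i1} (\bar{y}_{im.} - \beta) = 0$, for which

$$\beta = \frac{\sum_i \alpha_{i1} \bar{y}_{im.}}{\sum_i \alpha_{i1}},$$

where $\alpha_{i1}$ is estimated by the SVD of Z. (Note that here Z also changes iteratively.)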
Constrained SVD Non-COI Solution for the SREG Model
For the constrained SVD non-COI solution for the SREG model, a solution is required such that $Z = [z_{ij}] = [\bar{y}_{ij.} - \mu_j]$ has elements of its first right singular vector all of the same sign (or zero). The proposed solution to this problem is to put $\mu_j = \bar{y}_{.j.} + \delta$ and choose $\delta$ to satisfy the required condition. Note that, after shifting the $\mu_j$ values (from $\bar{y}_{.j.}$), the $\mu_j$ should no longer be perceived as estimates of site means.
If a constrained solution is needed, the $\gamma_{j1}$ values in the unconstrained solution will contain both positive and negative values. Let S- and S+ denote the sum of squares of the negative and positive $\gamma_{j1}$ values, respectively. If S- < S+, choose the site with the most negative $\gamma_{j1}$ value to be the site to have its $\gamma_{j1}$ value forced to zero in the constrained solution. Conversely, if S- > S+, choose the site with the largest positive $\gamma_{j1}$ value to have its $\gamma_{j1}$ value forced to zero. Suppose these rules lead to a Site m as the site so chosen, i.e., $\delta$ will be chosen to force $\gamma_{m1} = 0$. Solutions for $\delta$ are
![]() | [1] |
Derivation of this result is given in Appendix 1. Iterate until the value of $\delta$ converges, consistently choosing either the positive or negative solution on every iteration. At convergence, the quantity under the radical in Eq. [1] is necessarily positive. In practice, to ensure that the iteration actually gets started, replace the quantity under the radical with its absolute value if it is negative.
Typically, the negative solution for $\delta$ will make most of the $\alpha_{i1}$ values of the same sign as the nonzero $\gamma_{j1}$ values. The positive solution for $\delta$ will have the opposite effect. Absolute values of the $\gamma_{j1}$, and also of the $\gamma_{j2}$, will be the same for either solution, but this will not hold for the $\alpha_{i1}$ or for the $\alpha_{i2}$. The singular values ($\lambda_k$) and sequential sums of squares will be the same for either solution. Predicted values ($\hat{y}_{ij}$) will differ, but cultivar differences within any particular site will be the same under either solution.
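The rule for choosing the site whose primary effect is forced to zero (comparing S- with S+) can be sketched as follows. This covers only the site-selection step, not the iterative solution of Eq. [1]. The function name is hypothetical, and the handling of an exact tie between S- and S+ is an assumption, since the text does not specify it.

```python
import numpy as np

def site_to_constrain(gamma1):
    """Index of the site whose primary effect is forced to zero, or None."""
    g = np.asarray(gamma1, dtype=float)
    s_neg = float((g[g < 0] ** 2).sum())   # S-: sum of squares of negative values
    s_pos = float((g[g > 0] ** 2).sum())   # S+: sum of squares of positive values
    if s_neg == 0.0 or s_pos == 0.0:
        return None                        # effects already share one sign
    if s_neg < s_pos:
        return int(np.argmin(g))           # site with the most negative value
    # S- > S+; an exact tie is also sent here (assumption, not from the text)
    return int(np.argmax(g))               # site with the largest positive value

choice_a = site_to_constrain([0.5, -0.1, 0.7, -0.3])   # S- = 0.10, S+ = 0.74
choice_b = site_to_constrain([-0.7, 0.2, -0.6])        # S- = 0.85, S+ = 0.04
choice_c = site_to_constrain([0.1, 0.2])               # no constraint needed
```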
Constrained LS+1 Non-COI Solution for the SREG2 Model
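If the $\gamma_{j1}$ are to be forced to zero for $e_1$ sites in set $S_1$ and left unconstrained in the complementary set $S_2$ consisting of $e_2 = e - e_1$ sites, the residual sum of squares is

$$\mathrm{RSS} = \sum_{j \in S_1} \sum_i (\bar{y}_{ij.} - \mu_j)^2 + \sum_{j \in S_2} \sum_i (\bar{y}_{ij.} - \mu_j - \lambda_1 \alpha_{i1} \gamma_{j1})^2$$

(Crossa and Cornelius, 1997). Both terms in this expression are minimized by putting $\mu_j = \bar{y}_{.j.}$, with $\lambda_1$, $\alpha_{i1}$, and $\gamma_{j1}$ for sites in set $S_2$ obtained as the first component of the SVD of the g x $e_2$ matrix of deviations of cell means from site means, $\bar{y}_{ij.} - \bar{y}_{.j.}$, in set $S_2$.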
In practice, one makes a first choice of a site, which we will denote as Site m, to have its $\gamma_{j1}$ forced to zero, i.e., as a first choice for a site to be put into set $S_1$. This choice is made as described for choosing Site m for the constrained SVD non-COI solution. If the $\gamma_{j1}$ for the e - 1 sites remaining in set $S_2$ include values differing in sign, choose a second site to have its $\gamma_{j1}$ forced to zero. Continue this process until the SVD of set $S_2$ gives the remaining nonzero $\gamma_{j1}$ all having the same sign.
The fitted non-COI SREGLS+1 model is obtained by taking the non-COI LS SREG1 solution as the first multiplicative term and then extracting a second multiplicative term as the first component of the SVD of the matrix of deviations of the cell means from the non-COI LS SREG1 solution. Vectors of $\gamma_{j1}$ and $\gamma_{j2}$ values appear to be orthogonal to one another, but this does not hold for vectors of $\alpha_{i1}$ and $\alpha_{i2}$ values.
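The constrained LS+1 procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EIGAOV or SAS/IML implementation: the function names and data are hypothetical, and an exact tie between S- and S+ is resolved toward the positive side by assumption.

```python
import numpy as np

def first_component(Z):
    """Leading SVD component (singular value, left vector, right vector) of Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return s[0], U[:, 0], Vt[0]

def sreg_cls_plus1(Y):
    """Constrained LS SREG1 first term plus one SVD term from its residuals."""
    Y = np.asarray(Y, dtype=float)
    Z = Y - Y.mean(axis=0)                   # deviations from site means
    e = Z.shape[1]
    free = list(range(e))                    # set S2: sites left unconstrained
    while True:
        lam1, a1, g_free = first_component(Z[:, free])
        neg, pos = g_free[g_free < 0], g_free[g_free > 0]
        if neg.size == 0 or pos.size == 0:
            break                            # remaining effects share one sign
        # S- versus S+ rule; a tie falls to the positive side (assumption)
        if (neg ** 2).sum() < (pos ** 2).sum():
            drop = int(np.argmin(g_free))
        else:
            drop = int(np.argmax(g_free))
        free.pop(drop)                       # move that site into set S1
    gamma1 = np.zeros(e)
    gamma1[free] = g_free                    # sites in S1 keep a zero effect
    resid = Z - lam1 * np.outer(a1, gamma1)  # deviations from the LS SREG1 fit
    lam2, a2, gamma2 = first_component(resid)
    return lam1, a1, gamma1, lam2, a2, gamma2

# Hypothetical table: sites 1 and 2 agree, site 3 ranks cultivars in reverse.
Y = np.array([[5.0, 4.0, 3.0],
              [6.0, 5.5, 2.0],
              [7.0, 5.8, 1.5],
              [8.0, 7.0, 1.0]])
lam1, a1, gamma1, lam2, a2, gamma2 = sreg_cls_plus1(Y)
```

In this example the third site is moved into the constrained set, and the two vectors of site scores come out orthogonal, in line with the observation in the text.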
SREG Model Using Mandel's Solution
The biplot obtained from the SREG model with Mandel's solution was recently suggested by Yan et al. (2001) and consists of plotting Mandel's solution for site regression as the primary effect and the first principal component extracted from the regression deviations as the secondary effect (SREGM+1). The SREGM+1 model is

$$\bar{y}_{ij.} = \mu_j + b_j g_i + \lambda_1 \alpha_{i1} \gamma_{j1} + \bar{\varepsilon}_{ij.},$$

where $b_j$ is the regression coefficient of the jth site on the cultivar main effects ($g_i$) and the other terms are defined as in previous cases. This equation is Mandel's sites regression ($\bar{y}_{ij.} = \mu_j + b_j g_i + \bar{\varepsilon}_{ij.}$), plus one additional multiplicative term ($\lambda_1 \alpha_{i1} \gamma_{j1}$) estimated by subjecting the matrix of deviations from Mandel's regression model ($\bar{y}_{ij.} - \mu_j - b_j g_i$) to SVD.
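A minimal sketch of the SREGM+1 computation follows, assuming that the site regression coefficients are estimated by ordinary least squares on the cultivar main effects (the usual way of fitting Mandel's model, but an assumption here). The function name and data table are hypothetical.

```python
import numpy as np

def sreg_mandel_plus1(Y):
    """Mandel's sites regression plus one bilinear term from its residuals."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean(axis=0)                      # site means
    gi = Y.mean(axis=1) - Y.mean()           # cultivar main effects
    Z = Y - mu                               # site-centered cell means
    b = (gi @ Z) / (gi @ gi)                 # LS slope of each site on gi
    resid = Z - np.outer(gi, b)              # deviations from Mandel's regression
    U, s, Vt = np.linalg.svd(resid, full_matrices=False)
    return mu, gi, b, s[0], U[:, 0], Vt[0]

# Hypothetical 4-cultivar x 3-site table of cell means.
Y = np.array([[5.1, 4.0, 6.2],
              [4.8, 4.6, 5.0],
              [6.0, 3.9, 6.9],
              [5.5, 4.4, 5.7]])
mu, gi, b, lam1, alpha1, gamma1 = sreg_mandel_plus1(Y)
```

For the biplot, the primary coordinates would then be the cultivar main effects and the site slopes, and the secondary coordinates the scores of the extracted bilinear term.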
Biplot
Biplots obtained from linear-bilinear models, such as SHMM and SREG, are constructed from the SVD of the two-way table of deviations of empirical cell means from least squares estimates of the additive components. On a two-dimensional Cartesian coordinate system, markers for cultivars are plotted with primary effect (score in first multiplicative term) and secondary effect (score in second multiplicative term) as coordinates. A set of markers for sites is plotted on the same figure, also with primary and secondary effects as coordinates.
A full description of the interpretation of the biplots of multiplicative models is given in Gower and Hand (1996). Briefly, the cultivar and site scores are represented as vectors in a two-dimensional space, so it is useful to interpret biplots in terms of directions of the vectors and their projections. Cultivar and site vectors are defined as vectors from the origin (0,0) to the end points determined by their markers (scores). An angle of less than 90° or more than 270° between a cultivar vector and a site vector indicates that the cultivar had a positive response at that site. A negative cultivar response is indicated if the angle is between 90° and 270°. Note that in the SREG model, the interpretation of the biplot is with respect to the variation for which main effects of cultivars (G) and the GEI (G+GEI) account, whereas in the SHMM biplot, the interpretation is on the deviations from the shift parameter. Performance of a cultivar in a site can be approximated by the orthogonal projection of the cultivar vector onto the line determined by the direction of the site vector; that is, if we consider the line containing the site vector, the cultivar's response at that site is approximated by the length of the segment of that line extending from the origin to the point where that line is perpendicularly intersected by a line drawn from the cultivar marker.
The cosine of the angle between two site (or cultivar) vectors approximates the phenotypic correlation of yield performance of the two sites (or cultivars). An angle of zero indicates a correlation of +1; an angle of 90° (or -90°), a correlation of 0; and an angle of 180°, a correlation of -1. Furthermore, the cultivar scores for the first multiplicative component of the SREG model will usually be closely associated with the cultivar main effects.
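Both geometric readings of the biplot, the projection of a cultivar marker onto a site vector and the cosine of the angle between two vectors, reduce to dot products. The sketch below uses hypothetical marker coordinates and function names.

```python
import numpy as np

def response_on_site(cultivar, site):
    """Signed length of the projection of a cultivar marker onto a site vector."""
    c = np.asarray(cultivar, dtype=float)
    s = np.asarray(site, dtype=float)
    return float(c @ s / np.linalg.norm(s))   # positive: angle within 90 degrees

def cosine_between(u, v):
    """Cosine of the angle between two marker vectors drawn from the origin."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical markers (primary effect, secondary effect).
site = [2.0, 0.0]
positive_response = response_on_site([1.0, 1.0], site)    # angle of 45 degrees
negative_response = response_on_site([-1.0, 1.0], site)   # angle of 135 degrees
uncorrelated = cosine_between([1.0, 0.0], [0.0, 3.0])      # angle of 90 degrees
opposite = cosine_between([1.0, 0.0], [-2.0, 0.0])         # angle of 180 degrees
```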
The biplot methodology of Yan et al. (2000) forms a polygon by joining the most extreme cultivars of the biplot with line segments and then draws, for each side of the polygon, a line from the origin that perpendicularly intersects that side. These perpendiculars are extended sufficiently far to subdivide the biplot into sectors so that each site marker and each cultivar marker is contained within one (and only one) sector. When a polygon enclosing the origin cannot be formed because primary effects of cultivars, as well as primary effects of sites, are all of the same sign, but the signs for cultivars are opposite to those for sites, one can still draw straight lines joining the most extreme cultivars to form a polygon, as well as lines that pass through the origin and are perpendicular to the sides of the polygon. In many such cases, the perpendicular from the center of the biplot meets the extension of a side of the polygon beyond the corner (vertex) where that side ends, rather than the side itself.
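One way to sketch the polygon construction is to treat "the most extreme cultivars" as the vertices of the convex hull of the cultivar markers; this identification is an assumption made for illustration, as are the function names and marker coordinates. The second function finds, for each side, the foot of the perpendicular from the origin and reports whether it falls on the side or on its extension.

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(map(tuple, points))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()                      # drop markers that are not extreme
            h.append(p)
        return h[:-1]
    return half(pts) + half(reversed(pts))

def perpendicular_feet(hull):
    """Foot of the perpendicular from the origin to the line through each side."""
    feet = []
    for i in range(len(hull)):
        a = np.array(hull[i], dtype=float)
        b = np.array(hull[(i + 1) % len(hull)], dtype=float)
        d = b - a
        pos = -(a @ d) / (d @ d)             # 0..1 means the foot lies on the side;
        feet.append((a + pos * d, pos))      # otherwise it lies on the extension
    return feet

# Hypothetical cultivar markers; the last two are interior points.
markers = [(1, 1), (-1, 1), (-1, -1), (1, -1), (0, 0), (0.5, 0.2)]
hull = convex_hull(markers)
feet = perpendicular_feet(hull)
```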
Rescaling the Singular Vectors
For the graphical display of the biplots, it is advisable to absorb the singular values of the first and second multiplicative components, $\lambda_1$ and $\lambda_2$, into the singular vectors of sites ($\gamma_1$ and $\gamma_2$) and cultivars ($\alpha_1$ and $\alpha_2$) in such a way that the products of rescaled primary and secondary effects are equal to the contributions $\lambda_1 \alpha_{i1} \gamma_{j1}$ and $\lambda_2 \alpha_{i2} \gamma_{j2}$ of the first and second multiplicative components, respectively, to the predicted values of the attribute.
Let the rescaled values of the singular vectors of cultivars and sites be $\alpha^{*}_1 = \lambda_1^{A} \alpha_1$ and $\gamma^{*}_1 = \lambda_1^{1-A} \gamma_1$, respectively, with $0 \le A \le 1$. Select a value of A such that it forces the range of values in the singular vector for sites to be equal to the range of values in the singular vector for cultivars, that is, $\max(\gamma^{*}_1) - \min(\gamma^{*}_1) = \max(\alpha^{*}_1) - \min(\alpha^{*}_1)$. Define $C = \max(\alpha_1) - \min(\alpha_1)$ and $D = \max(\gamma_1) - \min(\gamma_1)$ (where max and min denote the largest and smallest elements of the vector, respectively). Then choose A such that $C \lambda_1^{A} = D \lambda_1^{1-A}$ or, equivalently, $\lambda_1^{2A-1} = D/C$. Solving for A gives $(2A - 1) \log(\lambda_1) = \log(D/C)$ and then

$$A = \frac{1}{2}\left[1 + \frac{\log(D/C)}{\log(\lambda_1)}\right], \qquad 1 - A = \frac{1}{2}\left[1 - \frac{\log(D/C)}{\log(\lambda_1)}\right].$$

If the ranges are the same, i.e., D = C, then A = (1 - A) = 0.5. The value of A for rescaling the vectors in the second multiplicative term is computed similarly.
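The rescaling can be checked numerically. The sketch below computes A from the ranges of the two singular vectors and confirms that the rescaled vectors have equal ranges while their products still reproduce the contribution of the multiplicative term. The function name and the example values are hypothetical, and the sketch assumes a positive singular value different from 1 and non-degenerate ranges.

```python
import numpy as np

def rescale_term(lam, alpha, gamma):
    """Absorb the singular value lam into cultivar and site scores with exponent A."""
    alpha = np.asarray(alpha, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    C = alpha.max() - alpha.min()            # range of cultivar scores
    D = gamma.max() - gamma.min()            # range of site scores
    A = 0.5 * (1.0 + np.log(D / C) / np.log(lam))
    return A, lam ** A * alpha, lam ** (1.0 - A) * gamma

# Hypothetical first multiplicative term.
lam1 = 4.0
alpha1 = np.array([0.8, 0.0, -0.6])
gamma1 = np.array([0.6, 0.0, 0.8])
A, alpha_star, gamma_star = rescale_term(lam1, alpha1, gamma1)
```

If the computed A falls outside the interval from 0 to 1, the two ranges cannot be equalized within the stated bounds; how to proceed in that case is not covered by this sketch.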
Defining Levels of COI
It is useful to classify subsets of sites and cultivars with different levels of COI, defined in terms of how much rank displacement has occurred in the COI. An order h-1 adjacent COI subset will be defined as a subset of $e_1$ sites and a subset of h cultivars such that the ranks of this subset of cultivars in the ranking of all cultivars in each of the sites in the subset are some permutation of the integers r+1, r+2,..., r+h, with these permutations not being the same at all sites in the subset, but with r being a constant integer, $0 \le r \le g - h$. The level of the adjacent COI subset will be defined as the maximum cultivar rank change that occurs from one site to another in the subset. Our main interest will be the cases where r = 0 and h = 2 (or 3), which constitute cases where the best two (or three) cultivars are the same two (or three) cultivars in every site in the subset. In other words, we will be interested in order 1 adjacent COI and order 2 adjacent COI.
For our purposes in this paper, an adjacent subset will be considered low level if its order is $\le 2$. Thus, since the level of an adjacent COI subset cannot exceed its order, cases where h = 2 or 3 are necessarily low level. In the sequel, for brevity we will drop the word adjacent, and simply characterize such subsets as low level COI subsets. Note that a level 0 adjacent COI subset cannot exist, because there must be at least two cultivars with rank changes if more than one permutation of the subset of ranks exists. Thus, only a non-COI subset can be at level 0 with respect to cultivar rank changes. In our usage, for a subset of cultivars to be adjacent, the members of the subset must not only be "adjacent" in every site, but they also must be consistently adjacent, i.e., r must be constant. When r is not constant but differs from site to site, the subset of cultivars is inconsistently adjacent in the subset of sites. (An extreme case would be when three cultivars are the three best in some sites and the three worst in other sites.)
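The definitions above can be turned into a small check on a table of within-site ranks. The sketch below is illustrative only: the function name, the dictionary it returns, and the rank table are hypothetical, and reporting a non-COI subset as level 0 is a convenience added here.

```python
import numpy as np

def adjacent_coi(ranks, cultivars, sites):
    """Classify a cultivar/site subset from a g x e table of within-site ranks (1 = best)."""
    sub = np.asarray(ranks)[np.ix_(cultivars, sites)]
    h = len(cultivars)
    r = sub.min(axis=0) - 1                       # candidate offset r in each site
    # In every site the ranks must be exactly r+1, ..., r+h in some order.
    block = np.sort(sub, axis=0) == (r + np.arange(1, h + 1)[:, None])
    if not (block.all() and (r == r[0]).all()):
        return None                               # not consistently adjacent
    if (sub == sub[:, [0]]).all():
        return {"type": "non-COI", "r": int(r[0]), "level": 0}
    level = int((sub.max(axis=1) - sub.min(axis=1)).max())   # largest rank change
    return {"type": "adjacent COI", "r": int(r[0]), "order": h - 1, "level": level}

# Hypothetical ranks of four cultivars (rows) in three sites (columns).
ranks = [[1, 2, 1],
         [2, 1, 3],
         [3, 3, 2],
         [4, 4, 4]]
top_two = adjacent_coi(ranks, [0, 1], [0, 1])
top_three = adjacent_coi(ranks, [0, 1, 2], [0, 1, 2])
broken = adjacent_coi(ranks, [0, 1], [0, 2])
```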
Software
Unconstrained and constrained SVD non-COI SREG2 and SHMM2 solutions can be computed by the FORTRAN program EIGAOV that can be run on a personal computer. The constrained LS+1 solution for SREG was obtained by importing the constrained non-COI LS SREG1 solution from EIGAOV into SAS/IML (SAS Institute, Inc., 1989) to complete the computation. Information about the use of the EIGAOV programs can be obtained from the second author.
| RESULTS AND DISCUSSION |
|---|
Rankings of the cultivar predicted values in each site for SHMM2 and SREG2 models based on scaled and unscaled data, and for the different non-COI constrained solutions are presented in Tables A1 and A2 (Appendix 2).
Unconstrained and Constrained SREG2 Solutions and Their Biplots
For the unscaled data and unconstrained model, SREG2(U/U), the FR test of Cornelius et al. (1992), which assesses the significance of the residual variation after fitting the first k - 1 multiplicative components, found no significant residual (P ≤ 0.05) after fitting the second multiplicative component, whereas the FGH1 test (Cornelius et al., 1996), used for judging the significance of sequentially fitted multiplicative terms, found three significant terms (P ≤ 0.05). For SREG2(U/CSVD), three and four terms were found significant (P ≤ 0.05) by the FGH1 and FR tests, respectively. For the scaled data and unconstrained model, SREG2(S/U), three and four terms were significant (P ≤ 0.05) by the FGH1 and FR tests, respectively, and for SREG2(S/CSVD), both tests found four significant terms (P ≤ 0.05).
The biplot of the SREG2 model using unscaled data, SREG2(U/U) (Fig. 1A), shows that cultivars, based on the sign of their primary effects ($\alpha_{i1}$), are divided into two groups, {G1, G2, G3, G7, G8} vs. {G4, G5, G6, G9}. Sites, based on the sign of their primary effects ($\gamma_{j1}$), are divided into two groups, {1, 3, 8, 10} vs. {2, 4, 5, 6, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}. Cultivars G1, G2, G3, G7, and G8 have a positive response in terms of their primary effects and GEI at Site 8 (their orthogonal projections onto the line containing the site vector are in the same direction as the site vector), as opposed to the projections of Cultivars G4, G5, G6, and G9, which are located on the opposite side.
The SREG2 biplot allows cultivar evaluation with respect to mean yield ($\alpha_{i1}$) and stability ($\alpha_{i2}$) and site evaluation with respect to discrimination ($\gamma_{j1}$) and representativeness ($\gamma_{j2}$) (Yan et al., 2001). In Fig. 1A, Cultivar G4 has the largest primary effect (high mean yield) and a near-zero secondary effect (stable across most sites), whereas Sites 1 and 3 do not discriminate cultivars well (relatively small $\gamma_{j1}$), and their relatively large $\gamma_{j2}$ values predict large GEI, that is, inconsistency of cultivar responses in Sites 1 and 3 as compared with their responses in other sites. The polygon has seven vertices located at markers for Cultivars G1, G2, G7, G9, G4, G5, and G8. In the upper right sector of Fig. 1A, Cultivar G4 had the best SREG2 predicted values at Sites 2, 4 through 6, 9, 11 through 14, 16, and 18 (Table 3), but it ranked among the worst three cultivars (7th, 8th, and 9th ranks) in sites located in the opposite sectors containing Sites 1, 3, 8, and 10 (Table 3). On the contrary, Cultivar G8 is the winner in Sites 1, 3, and 10 and the loser in sites located in the opposite sector, Sites 2, 4 through 6, 9, 11 through 14, 16, and 18. Thus, Cultivars G4 vs. G8 and Sites 1, 3, 8, and 10 (with negative primary effects) vs. Sites 2, 4 through 6, 9, 11 through 14, 16, and 18 (with positive primary effects and located in the opposite sector of the biplot) had a clear COI pattern. Cultivars G1 and G5 are the winner (1st rank) and loser (9th rank) in Site 8, respectively (Table 3). Similar patterns of COI can be observed for the cultivar subset {G1 and G2} vs. G5 at Sites 7, 15, 17, 19, and 20 (with positive primary effects) as compared to Site 8 (with negative primary effects). The Site 8 marker was located far away from the other site markers so that it can be considered very different from the other sites.
The biplot of the constrained SREG2 model using unscaled data and the SVD non-COI constrained solution, SREG2(U/CSVD) (Fig. 1B), showed Sites 1, 3, and 10 with $\gamma_{j1} > 0$ and Site 8 with $\gamma_{j1} = 0$. Similar to the SREG2(U/U) model, two groups of cultivars are formed, {G1, G2, G3, G7, G8} and {G4, G5, G6, G9}. Constraining the first term of SREG2 gave negative primary effects for all cultivars and non-negative primary effects for all sites (zero for Site 8). The lower dispersion of the points in this biplot reflects the lower variability explained by the constrained solution as compared with that obtained by the unconstrained solution. Although a polygon that contains the plot origin (0, 0) cannot be drawn, the properties of the biplot remained the same as those given for the biplot obtained with the unconstrained solution. Figure 1B predicts COI for Cultivars G1 and G4 at Sites 1, 3, and 10 as compared to Sites 4, 5, 11 through 13, 15, and 18 (Table 3). Similarly, COI is found between Cultivars G6 and G8 in Sites 2, 6, 7, 9, 14, and 16 through 20 as compared with Site 8. The Site 8 marker is far away from the others in the biplot; the constrained solution sets its primary effect equal to zero, and 50% of the variation explained by the second multiplicative component is due to cultivar differences within Site 8. The line perpendicular to the segment joining Cultivars G1 and G6 separates the two non-COI groups of sites and cultivars. One low order COI subset comprises Cultivars G4, G5, and G6 and all sites except Sites 1, 3, 8, and 10 (note that Cultivar G9 ranked fourth in most of these sites) (Table 4). A non-COI group includes Cultivars G1 and G3 and Sites 1, 3, and 10 (Table 4).
For the unconstrained SREG2 model using scaled data, SREG2(S/U), the biplot (Fig. 1C) gave results similar to those found for SREG2(U/U). In the right sector, Cultivar G4 had the best SREG2 predicted values at Sites 2, 4 through 7, 9, 11 through 16, 18, and 19 (Table 3), but it ranked among the worst three cultivars (7th, 8th, and 9th ranks) in sites located in the opposite sectors, Sites 1, 3, 8, and 10 (Table 3). On the contrary, Cultivar G8 is the winner in Sites 1, 3, and 10 and the loser in sites located in the opposite sector, Sites 2, 4 through 6, 9, 11 through 16, and 18. Thus, Cultivars G4 and G8 show COI at Sites 2, 4 through 6, 9, 11 through 16, and 18 (with positive primary effects) as compared to Sites 1, 3, and 10 (with negative primary effects). On the other hand, Cultivars G4, G5, and G6 and Sites 2, 4, 5, 7, 9, and 11 through 20 represent a low level COI group (Table 4); these cultivars with Sites 3, 8, and 10 formed a non-COI group but in the negative direction (poor yield response). Also, Cultivars G1, G2, and G8 and Sites 3 and 10 formed a non-COI group.
The biplot of the SREG2(S/CSVD) (Fig. 1D) is similar to the SREG2(U/CSVD) biplot. Cultivars G1 vs. G4 and Sites 2, 4, 5, 11 through 13, 15, 16, and 18 vs. Sites 1, 3, 8, and 10 formed a COI group (Table 3). Cultivars G4, G5, and G6 were the three best ranking cultivars in all sites except Sites 1, 3, 8, and 10, and thus formed a low level COI subset. Cultivars G1, G3, and G6 in Sites 1 and 3 and also Cultivars G1 and G8 in Sites 8 and 10 (Table 4) formed a non-COI group. Site 8 is very distinct from the others and explained 51% of the second term variability (data not shown). Biplot of the constrained non-COI solution showed less dispersion of points but interpretation similar to the biplots obtained from unconstrained solutions.
Constrained LS+1 SREG2 Solution and its Biplot.
The biplot of the LS+1 constrained SREG2 model using unscaled data, SREG2(U/CLS+1) (Fig. 2A), produced a polygon that is a triangle in which Cultivars G1 and G4 have a COI in Sites 3, 8, and 10 as compared to the rest of the sites (Table 3). Most of the variation described by the second component (80%) is due to cultivar differences within Site 8. Sites 1, 3, and 10 are located toward the center of the biplot and thus cultivar differences at those sites are small. Cultivars, based on the sign of $\alpha_{i1}$, are divided into two groups, {G1, G2, G3, G7, G8} vs. {G4, G5, G6, G9}. Sites 1, 3, 8, and 10 have $\gamma_{j1} = 0$. Cultivars G4, G5, G6, and G9 in all sites, except Sites 3, 8, and 10, formed a clear low level COI subset and Cultivars G1, G2, G3, G7, and G8 in Sites 3, 8, and 10 (Table 4) formed a non-COI subset. Very similar COI subsets of cultivars and sites are found for the biplot on the scaled data, SREG2(S/CLS+1) (Fig. 2B), except that now Cultivar G8 ranked fourth in Sites 1, 3, 8, and 10 and that for Site 1 the best three cultivars were G1, G2, and G3.
SREG Model Using Mandel's Solution and its Biplot.
Recently, Yan et al. (2001) showed that the biplots from the SREG model using the Mandel solution (SREGM+1) and the standard SREG2 model gave similar winning cultivars as well as similar GEI patterns. The advantage of the SREGM+1 biplot is that the first component indicates mean yield and the second component stability; for the SREG2 model, this is so only if the first bilinear component is highly correlated with the cultivar main effects.
The biplot of the SREGM+1 model using unscaled data, SREGM+1(U) (Fig. 2C) showed the same split of cultivars and sites that was previously found, that is, {G1, G2, G3, G7, G8} vs. {G4, G5, G6, G9} and sites {1, 3, 8, 10} vs. the rest. A polygon with six vertices G1, G5, G6, G4, G9, and G8 (counter-clockwise around the polygon) is formed with Cultivars G4 and G8 at opposite sectors having COI in Sites 1, 8, and 10 as compared with Sites 2, 5, 11, 12, 14, and 18 (Table 3). Cultivars G4, G5 and G6 had the best three predicted values in Sites 4, 6, 7, 9, 13 through 17, 19, and 20 (located toward the lower right quadrant of the biplot) and thus formed a low level COI group (Table 4). Cultivars G1 and G8 are the best two in Sites 1, 8, and 10 and formed a non-COI group (Table 4 and Appendix 2 Table A1). Sites 2, 5, 11, 12, and 18 had very similar cultivar ranking and gave Cultivars G4, G6 and G9 as the best three performers and thus formed a low level COI group.
The biplot of the SREGM+1 model using scaled data SREGM+1(S) (Fig. 2D) showed a COI pattern between Cultivars G8 and G6 in Sites 1 and 10 as compared to Sites 2, 4, 6, 7, 9, 11 through 16, 18 (Table 3). Similar to the SREGM+1(U) case, Cultivars G4, G5, and G6 had the best three predicted values in Sites 2, 4 through 7, 9, 11 through 20 and, thus, formed a clear low level COI group (Table 4).
Unconstrained and Constrained SHMM2 and Their Biplots
For SHMM2(U/U) and SHMM2(U/CSVD), the FR and FGH1 tests found that the first three multiplicative components were significant (P ≤ 0.05). For SHMM2(S/U), the FR and FGH1 tests found that the first three and four multiplicative components were significant (P ≤ 0.05), respectively. Since all primary effects of sites for the SHMM2(S/U) model were of the same sign, biplots with non-COI constrained SHMM2 solutions for this model were not required. The biplot of the SHMM2(U/U) model (Fig. 3A) had a subset of sites {1, 7, 8, 10, and 19} with negative values of site primary effects, while the rest of the sites had positive values. All cultivars had positive and high values for their primary effects. The secondary effects of cultivars separated them into two groups, {G1, G2, G3, G7, and G8} vs. {G4, G5, G6, and G9}. This subdivision of cultivars was also obtained for SHMM2(U/CSVD) (Fig. 3B), SHMM2(S/U) (Fig. 3C), and SHMM2(U/CLS+1) (Fig. 3D).
Cultivar G1 is chosen by SHMM2(U/CSVD), SHMM2(S/U), and SHMM2(U/CLS+1) as best in Sites 1, 3, 8, and 10, and Cultivars G1, G2, and G3 are found as a low level COI group in Sites 1, 3, and 10 by SHMM2(U/CSVD), in Sites 3 and 10 by SHMM2(S/U), and in Sites 1 and 3 by SHMM2(U/CLS+1). Also, all the SHMM2 biplots indicated that Site 8 is very different from the others. It is apparent that the constrained SHMM2 solutions do not affect the interpretability of the biplots for finding COI and non-COI groups of cultivars and sites.
Clustering of Sites or Cultivars into Groups with Non-COI
It is useful to investigate the clustering of sites (or cultivars) into non-COI subsets. This was done by means of the SREG1 model and the clustering strategy proposed by Crossa et al. (1993) for grouping sites, and the fusion method of Crossa and Cornelius (1993) for clustering cultivars.
Recently, Cornelius et al. (1996) and Cornelius and Crossa (1999), in a cross-validation study involving five multienvironment cultivar trials, found that shrinkage estimates of multiplicative models were usually more accurate for predicting the response of cultivars within sites than were best truncated multiplicative models fitted by least squares, best linear unbiased predictors (BLUPs) based on a two-way random effects model with interaction, and the empirical cell means. For Trial 1 (Trial 3 in Cornelius and Crossa, 1999), the shrinkage estimates of multiplicative models were better predictors than BLUPs and empirical cell means. Consequently, clustering of sites (or cultivars) in Trial 1 into non-COI groups was also computed by means of distances between sites computed with the empirical cell (cultivar x site) means replaced by SREG shrinkage estimates as input data.
Dendrograms and final groups of sites and cultivars based on empirical cell means are shown in Fig. 4, and the dendrogram and final groups of sites based on SREG shrinkage estimates of cell means are shown in Fig. 5. In both cases, sites are grouped into two major clusters {1, 3, 8, 10} vs. {2, 4, 5, 6, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}; cultivars are split into two main subsets {G1, G2, G3, G7, G8} vs. {G4, G5, G6, G9}. This separation of the sites and cultivars into two main groups was consistently found in all model-data-constraint combinations previously described. The advantage of the biplots, however, is that sites and cultivars can be simultaneously clustered into subsets with non-COI.
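The non-COI criterion that underlies this grouping (primary site effects γj1 all of the same sign under SREG1) can be illustrated with a small sketch. This is not the clustering procedure of Crossa et al. (1993); it only checks the sign condition for one table of cell means. It assumes the primary site effects are taken as the leading right singular vector of the site-centred table, computed here by plain power iteration. The two toy tables and the function names are hypothetical.

```python
def first_site_vector(y, iters=200):
    """Leading right singular vector of the site-centred table y[cultivar][site].

    Under SREG the site means are removed first, so this vector plays the
    role of the primary site effects (defined only up to an overall sign).
    """
    g, e = len(y), len(y[0])
    mu = [sum(y[i][j] for i in range(g)) / g for j in range(e)]  # site means
    z = [[y[i][j] - mu[j] for j in range(e)] for i in range(g)]  # centred data
    v = [1.0] * e  # starting vector for the power iteration
    for _ in range(iters):
        u = [sum(z[i][j] * v[j] for j in range(e)) for i in range(g)]  # Z v
        w = [sum(z[i][j] * u[i] for i in range(g)) for j in range(e)]  # Z'Z v
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    return v


def same_sign(v, tol=1e-9):
    """True if all entries are non-negative or all are non-positive (within tol)."""
    return all(x > -tol for x in v) or all(x < tol for x in v)


# Hypothetical tables (3 cultivars x 3 sites), not data from the trials.
y_noncoi = [[6.0, 7.0, 8.0], [5.0, 5.5, 6.0], [4.0, 3.5, 4.0]]  # same ranking everywhere
y_coi = [[6.0, 7.0, 3.0], [5.0, 5.0, 5.0], [4.0, 3.0, 7.0]]     # ranking reverses at site 3

noncoi_ok = same_sign(first_site_vector(y_noncoi))
coi_ok = same_sign(first_site_vector(y_coi))
```

In the first table the cultivar ranking is the same at every site and the sign check passes; in the second the ranking reverses at the third site and the primary site effects take both signs.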
Trial 2
Unconstrained and Constrained SHMM2 and SREG2 Solutions and Their Biplots. Tests of the statistical significance of the multiplicative terms (Cornelius et al., 1992) needed to describe the variation in the Trial 2 data showed that two multiplicative components were significant (P < 0.05) for SREG and SHMM. The SREG2(U/U) biplot (Fig. 6A) is divided into five sectors by five winning cultivars: Cultivar G2 was the predicted winner in four environments (7, 13, 20, and 23); G3 won in seven environments (3, 5, 8, 12, 18, 19, and 25); G5 won in a single environment, 15; G9 won in 10 environments (4, 6, 10, 11, 14, 16, 17, 21, 22, and 26); and G10 won in four environments (1, 2, 9, and 24). Cultivar G9 won in more environments than any other cultivar and had the highest mean yield. Cultivar G3 won in the second highest number of environments and had the second highest mean yield. A COI pattern is evident from opposite sectors of Fig. 6A. For example, G10 is the predicted winner at Sites 1, 2, 9, and 24 and the predicted loser at Sites 7, 20, and 23; also, G2 is the worst cultivar at Sites 1, 2, and 24 and the winner at Sites 7, 20, and 23; the observed values (Table 2) confirmed these approximations. A similar COI pattern can be found between the sector where Cultivar G5 is the winner and the sector where G9 is the winner.
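The winner-by-sector reading of the biplot can also be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates the SREG2 prediction ȳij. = µj + Σk λk αik γjk and reports the cultivar with the largest predicted value at each site. The site means, singular values, scores, and cultivar labels are made-up illustrative values (they are not taken from Trial 2 and are not normalized), and the function names are hypothetical.

```python
def sreg_predict(mu, lam, alpha, gamma):
    """Predicted cell means: yhat[i][j] = mu[j] + sum_k lam[k]*alpha[i][k]*gamma[j][k]."""
    return [
        [mu[j] + sum(l * a * c for l, a, c in zip(lam, alpha_i, gamma[j]))
         for j in range(len(mu))]
        for alpha_i in alpha
    ]


def winners_by_site(yhat, names):
    """Map each site index to the cultivar with the largest predicted value."""
    n_sites = len(yhat[0])
    result = {}
    for j in range(n_sites):
        best = max(range(len(yhat)), key=lambda i: yhat[i][j])
        result[j] = names[best]
    return result


# Illustrative (hypothetical) inputs: 3 cultivars, 4 sites, 2 bilinear terms.
names = ["A", "B", "C"]
mu = [5.0, 6.0, 4.0, 5.5]                       # site means
lam = [3.0, 1.5]                                # singular values
alpha = [[0.8, 0.1], [0.5, -0.7], [0.3, 0.7]]   # cultivar scores (primary, secondary)
gamma = [[0.6, 0.1], [0.5, -0.6], [0.4, 0.7], [-0.5, 0.3]]  # site scores

yhat = sreg_predict(mu, lam, alpha, gamma)
winners = winners_by_site(yhat, names)
```

In this toy case cultivar A is the predicted winner at site 0 and the predicted loser at site 3, where the primary site score is negative; that reversal is the kind of COI pattern read from opposite sectors of the biplot.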
γj1 < 0 on the unconstrained solution are now forced to have γj1 = 0. No solution was obtained for the SVD non-COI constrained SREG2, probably because as many as six environments had primary effects of a different sign than did the complementary subset of 20 environments. The SHMM2(U/U) biplot (Fig. 6C) showed that all cultivars and sites have primary effects of the same sign, reflecting a complete non-COI, and that only G9 and G3 are winners, whereas the SREG2(U/U) biplot showed these as winners in only 17 of the 26 sites. The discrepancy is probably due to greater power of SREG2 to detect COI. Results of the clustering of cultivars into groups with non-COI showed two main groups {G1, G11, G4, G6, G7, G2, G9} vs. {G3, G10, G8, G5} (dendrogram not shown). These two groups are clearly separated in the three biplots (Fig. 6A–6C). The sites are clustered into three major groups {1, 24, 9, 3, 25, 8, 12, 5, 19}, {4, 6, 11, 14, 17, 26, 7, 13}, and {10, 20, 23, 18, 21}; Sites 2, 16, 15, and 22 are left unclustered. The first of these site groups tended to cluster in the lower left quadrant of the SREG2(U/U) biplot (Fig. 6A), whereas the latter two groups are located toward the lower right and upper right quadrants.
In summary, the wheat data set confirmed the findings from the maize data that both SREG2 and SHMM2 biplots can be used to identify subsets of cultivars and sites with COI and non-COI. Since the SREG2 focuses on and explains more of the cultivar main effect and the GEI, which are the sources of yield variation that are relevant to cultivar evaluation and cultivar performance based on megaenvironment identification, the SREG2 biplot gives good discrimination and resolution of the cultivars and the sites. This is consistent with the conclusion of Crossa and Cornelius (1997) when comparing SREG1 with SHMM1 in studying COI.
| CONCLUSIONS |
|---|
|
|
|---|
γj1, are of the same sign. The biplots obtained using the constrained non-COI first term solutions for the SREG2 and SHMM2 models have the same interpretability properties as the standard biplots obtained using the unconstrained solution and give a good approximation to the patterns existing in the observed data. However, the biplot based on the unconstrained solution explains more variation and, therefore, has greater power to separate both cultivars and sites. With the constrained solution, it is possible to identify subsets of sites and cultivars with low level COI and non-COI.
Results of this study indicate that the biplots of the SREG2 and SHMM2 models are useful for identifying subsets of sites and cultivars with COI, low level COI, and non-COI. In general, biplots based on unscaled or scaled data gave rise to similar results. Groups of sites and cultivars with low level COI and non-COI were similar to those found when only sites (or cultivars) were clustered into non-COI groups using the SHMM and SREG clustering approach. This result confirms the benefits of using the biplots for finding simultaneous subsets of sites and cultivars with COI, low-level COI, and non-COI for breeding and agronomic purposes.
| APPENDIX 1 |
|---|
|
|
|---|
such that
![]() | [A1] |
im. -
.m. Evidently, this should lead to
![]() |
= 0), then
![]() |
i1 values will allow an iterative solution to be computed. But
![]() | [A2] |
Substitution of [A2] into [A1] gives
![]() | [A3] |
![]() |
![]() | [A4] |
Putting
m1 = 0 in [A4] gives the solution shown as Eq. [1] in the text.
| APPENDIX 2 |
|---|
|
|
|---|