Crop Science Journal of Natural Resources and Life Sciences Education
Published in Crop Sci. 43:1967-1975 (2003).
© 2003 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

CROP BREEDING, GENETICS & CYTOLOGY

Additive Main Effects and Multiplicative Interaction Model

I. Theory on Variance Components for Predicting Cell Means

J. Moreno-González (a), J. Crossa* (b), and P. L. Cornelius (c)

(a) Centro de Investigaciones Agrarias de Mabegondo, Apartado 10, A Coruña, Spain
(b) Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico DF, Mexico
(c) Dep. of Agronomy and Dep. of Statistics, Univ. of Kentucky, Lexington, KY 40546-0091

* Corresponding author (j.crossa{at}cgiar.org)


    ABSTRACT
 
Many studies have shown the practical advantages of applying the additive main effects and multiplicative interaction (AMMI) model to multienvironment trials; however, a theory about the contributions of error and genotype x environment interaction (GEI) variance components to interaction principal components (PCs) in AMMI models is needed. The objectives of this work were to (i) develop an eigenvalue partition (EVP) method for separating variation attributable to each AMMI interaction PC into interaction variance and error variance components; (ii) develop a root mean square predictive difference (RMSPD) criterion based on the EVP theory (RMSPDEVP) for selecting the best truncated AMMI model; (iii) apply the RMSPDEVP criterion to three multienvironment cultivar trials and to simulation data generated from them; and (iv) validate the EVP method by comparing results of the RMSPDEVP criterion with those obtained with the criterion conventionally used to choose a truncated AMMI model by cross validation (RMSPDCV). A data resampling method was devised to estimate the contribution of error variance to the eigenvalues. The coefficients of the structural GEI variance component were always larger than those of the error variance component for the earlier PC axes. As the error associated with the cell means decreased and the number of replications increased, the portion of the cumulative GEI explained by the earlier AMMI PC axes generally increased, whereas the portion of the error sum of squares (SS) explained by the earlier AMMI PC axes decreased. The RMSPDEVP and RMSPDCV methods selected similar truncated AMMI models. The RMSPDEVP criterion is useful for selecting the best truncated AMMI model, with the advantage that it can be applied to all trial replications.

Abbreviations: AMMI, additive main effects and multiplicative interaction • BLUP, best linear unbiased predictor • COMM, completely multiplicative model • EVP, eigenvalue partition • GEAR, genotypes, environment, attribute regression model • GREG, genotypes regression model • GEI, genotype x environment interaction • PC, principal component • SREG, sites regression model • SHMM, shifted multiplicative model • RMSPD, root mean square predictive difference


    INTRODUCTION
 
In plant breeding, genotypes are evaluated in multienvironment trials to test their performance across environments and to select the best genotypes in specific environments. Variance due to GEI is an important component of the variance of phenotypic means in selection experiments (Hallauer and Miranda, 1983). Reduction of the GEI and error variances associated with phenotypic means will increase the heritability of the trait under selection. Several strategies have been proposed to increase precision in estimating cell (genotype–environment combination) means. One strategy is based on a model that combines genotype, environment, and attributes in regression models (GEAR), which reduces the GEI and error variances of predicted cells (Moreno-González and Crossa, 1998). Other models are the additive main effects and multiplicative interaction model (AMMI), the sites regression model (SREG), the genotypes regression model (GREG), the completely multiplicative model (COMM), and the shifted multiplicative model (SHMM) (Cornelius et al., 1996).

The AMMI model has been extensively applied in the statistical analysis of multienvironment cultivar trials (Kempton, 1984; Gauch, 1988; Gauch and Zobel, 1989; Crossa et al., 1990; Gauch and Zobel, 1997). Its main feature is that it partitions the GEI source of variation into variation due to principal component (PC) interaction parameters. The AMMI models with one, two, three, and subsequent multiplicative PC components are named truncated AMMI1, AMMI2, AMMI3, etc., respectively. The AMMI0 model includes the main effects without interaction.

Criteria for determining the optimal number of multiplicative terms to be retained in the multiplicative models include sequential tests of the null hypothesis that determine what multiplicative terms are negligible, and random splitting of the data and cross validation procedure. Shrinkage estimators have been suggested as an alternative to model truncation (Cornelius et al., 1996; Cornelius and Crossa, 1999). The most commonly used test for determining the number of significant multiplicative terms is the Gollob (1968) F-test. Cornelius (1993) showed that this test is very liberal and typically results in too many multiplicative terms being judged significant.

Using the AMMI model, Gauch (1988) and Gauch and Zobel (1988) proposed a random data splitting and cross validation scheme in which some replicates of each genotype–environment combination are used for modelling and the remaining replicates are used for validation. The model with the smallest root mean square predictive difference (RMSPD) is selected as the best predictive model. However, there are two major problems with this approach: (i) prediction using all replications is not possible because at least one replicate has to be reserved for validation, and (ii) the procedure does not consider any block structure given by the experimental field design. Therefore, neither the modelling data nor the validation data are free of block effects. Cornelius and Crossa (1999) overcame the problem of block effects by performing the cross validation on data adjusted by replicate differences within each environment. Piepho (1994) used r - 1 replicates at each site for modelling and the remaining complete replicate for validation (r is the total number of replicates per site).

Cornelius et al. (1993, 1996) obtained shrinkage estimators of multiplicative models (AMMI, SREG, GREG, COMM, and SHMM) for estimating the realized performance level of cultivars in the testing environments and presented evidence that the predictive accuracy of shrinkage estimators was at least as good as, and often better than, both the better choice of truncated multiplicative model and the best linear unbiased prediction (BLUP) of the cell means using a two-way random effects model.

Although there have been many studies showing the practical and theoretical advantages of applying AMMI to multienvironment trials, a theory about the structure of the error and genotype x environment variance components involved in the AMMI model is lacking. The development of such a theory would allow determination of which part of the sum of squares associated with the orthogonal partition of the GEI obtained by the AMMI PC axes is due to random errors (or unstructured GEI) and which part is due to structured GEI. Applications of this theory using all trial replicates and/or considering the block structure of the design as well as its use in the context of random data splitting and cross validation should be useful for selecting the best truncated AMMI model.

The main objectives of this study were (i) to develop an eigenvalue partition (EVP) method for estimating the relative contribution of interaction variance and error variance to variation explained by orthogonal PCs when an AMMI model is fitted; (ii) to develop a criterion, root mean square predictive differences on the basis of the EVP method (RMSPDEVP), for selecting the best truncated AMMI model; (iii) to evaluate the EVP method and the RMSPDEVP criterion in three multienvironment cultivar trials with randomized complete block designs (RCBD) and simulation data generated from these trials for selecting the best truncated AMMI; and (iv) to validate the EVP method by comparing results of the RMSPDEVP criterion with those obtained by the RMSPDCV criterion conventionally used to choose a truncated AMMI model by cross validation.


    MATERIALS AND METHODS
 
Theory of the Eigenvalue Partition (EVP) Method
We develop a method for partitioning the sum of squares (SS) associated with each PC axis of the AMMI model into a structural and an error component. In multienvironment cultivar trials of $m$ genotypes ($i = 1, \ldots, m$), $n$ environments ($j = 1, \ldots, n$), and $r$ replicates ($t = 1, \ldots, r$) arranged in RCBD, the linear model for the conventional analysis of variance (ANOVA) is

$$y_{ijt} = \mu + g_i + e_j + r(e)_{jt} + (ge)_{ij} + \varepsilon_{ijt}$$

where $y_{ijt}$ is the observation of the $t$th replicate of the $i$th genotype in the $j$th environment; $\mu$ is the overall mean; $g_i$, $e_j$, $r(e)_{jt}$, and $(ge)_{ij}$ are the effects of the $i$th genotype, the $j$th environment, the $t$th replication within the $j$th environment, and the interaction of the $i$th genotype with the $j$th environment, respectively; $\varepsilon_{ijt}$ is the experimental error associated with the $(ijt)$th observation and is assumed to be independently distributed with homogeneous error variance $\sigma^2_e$.

The following expressions hold:

$$y_{ij} = a_{ij} + z_{ij}, \qquad a_{ij} = y_{i.} + y_{.j} - y_{..}, \qquad z_{ij} = y_{ij} - y_{i.} - y_{.j} + y_{..}$$

where $y_{ij}$, $a_{ij}$, and $z_{ij}$ are the observed cell mean, the additive main effects component, and the estimated GEI effect for the $(ij)$th cell mean, respectively. Also, $y_{i.}$, $y_{.j}$, and $y_{..}$ are the mean of the $i$th genotype, the mean of the $j$th environment, and the overall mean, respectively. The additive main effects component of the $(ij)$th cell includes the additive main effects $g_i$ and $e_j$ and the overall mean.

The AMMI model has been used for analyzing multienvironment cultivar trials as follows:

$$y_{ij} = \mu + g_i + e_j + \sum_{k=1}^{q} \lambda_k \alpha_{ik} \gamma_{jk} + \varepsilon_{ij}$$
[1]
where $y_{ij}$ is the cell mean of the $i$th genotype in the $j$th environment; $\lambda_k$ is the singular value (square root of the eigenvalue) for the $k$th PC axis; interaction parameters $\alpha_{ik}$ and $\gamma_{jk}$ are elements of the $k$th singular vector for genotypes and environments, respectively, and are interpretable as scores for the contribution of the $i$th genotype and the $j$th environment to the $k$th PC; $\varepsilon_{ij}$ is the residual, which includes the residual interaction not accounted for by the multiplicative terms and the experimental error; $q$ is less than or equal to the smaller of $(m - 1)$ and $(n - 1)$. In the saturated AMMI model, the maximum number of principal components is $q$. Interaction parameters of the AMMI multiplicative terms are estimated by the singular value decomposition of the two-way table of the observed interaction effects $z_{ij}$ (residuals after fitting the additive main effects of sites and genotypes) (Gabriel, 1978).
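The estimation step described above (additive fit, then singular value decomposition of the interaction residuals) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `ammi_fit` and its arguments are hypothetical, and the table of cell means is assumed to be complete, with genotypes in rows and environments in columns.

```python
import numpy as np

def ammi_fit(Y, p):
    """Fit a truncated AMMI model with p multiplicative terms.

    Y is an m x n table of cell means (hypothetical input layout:
    genotypes in rows, environments in columns).
    """
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean()                          # overall mean y..
    g_mean = Y.mean(axis=1, keepdims=True)    # genotype means y_i.
    e_mean = Y.mean(axis=0, keepdims=True)    # environment means y_.j
    A = g_mean + e_mean - grand               # additive component a_ij
    Z = Y - A                                 # interaction residuals z_ij
    # Z = U diag(lam) Vt: columns of U hold the genotype scores,
    # rows of Vt hold the environment scores, lam the singular values.
    U, lam, Vt = np.linalg.svd(Z, full_matrices=False)
    Z_p = (U[:, :p] * lam[:p]) @ Vt[:p, :]    # first p multiplicative terms
    return A, Z, lam, A + Z_p
```

With `p = 0` the prediction is the additive model (AMMI0); with `p = min(m - 1, n - 1)` it reproduces the observed cell means (the saturated model).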

Variance Components of Estimated Parameters
Additive main component $a_{ij}$. The error variance associated with $a_{ij}$ can be derived from the expression $a_{ij} = y_{i.} + y_{.j} - y_{..}$ by decomposing each term into independent observations. This results in:

$$\operatorname{Var}(a_{ij}) = \frac{(m + n - 1)}{mnr}\,\sigma^2_e$$
[2]
where $\sigma^2_e$ is the error variance associated with an observation.

Estimated GEI effect $z_{ij}$. The estimated and true structural GEI effects are related through the following expression

$$z_{ij} = \beta_{ij} + \delta_{ij}$$

This can be written in matrix notation as

$$\mathbf{Z} = \mathbf{B} + \mathbf{D}$$
[3]
where $z_{ij}$ is the estimated GEI effect, $\beta_{ij}$ is the true structural GEI effect, and $\delta_{ij}$ is the error effect contributing to $z_{ij}$; $\mathbf{Z}$, $\mathbf{B}$, and $\mathbf{D}$ are the matrices of $z_{ij}$, $\beta_{ij}$, and $\delta_{ij}$ effects, respectively. As can be shown from expected mean squares for GEI in the ANOVA of a multienvironment cultivar trial, the variance components of $z_{ij}$ on the basis of cell means are

$$\sigma^2_{z} = \frac{(m - 1)(n - 1)}{mn}\left(\sigma^2_{GE} + \frac{\sigma^2_e}{r}\right)$$

where $\sigma^2_{GE}$ is the expected variance of the structural GEI effects (the GEI variance component).

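Multiplicative interaction of the AMMI model. The predicted GEI effect $z^*_{ijp}$ for each cell mean in the AMMIp model is the multiplicative term of Eq. [1] truncated after the first $p$ PC axes:

$$z^*_{ijp} = \sum_{k=1}^{p} z_{ijk} = \sum_{k=1}^{p} \lambda_k \alpha_{ik} \gamma_{jk}$$

where $z_{ijk}$ is the $k$th PC term of $z^*_{ijp}$, and $p$ is the number of first PC axes retained in the AMMIp model.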

Another way to express $z_{ijk}$ is

$$z_{ijk} = \gamma_{jk} \sum_{v=1}^{n} \gamma_{vk} z_{iv}$$
[4]
where $n$ is assumed to be the smaller of $(m, n)$; $\gamma_{jk}$ and $\gamma_{vk}$ are the $j$th and $v$th scores of the $k$th eigenvector in the two-way table of $z_{ij}$ elements, taking the $n$ environments (columns) as variables; subscript $v$ refers to the variable columns ($v = 1, 2, \ldots, n$). If $m < n$, replace $n$ with $m$ in Eq. [4]. All elements of Eq. [4] can be expressed in matrix form as follows:

$$\mathbf{Z}_k = \mathbf{Z} \boldsymbol{\gamma}_k \boldsymbol{\gamma}_k'$$
[5]
where $\mathbf{Z}_k$ and $\mathbf{Z}$ are the matrices whose elements are $z_{ijk}$ (i.e., the $k$th orthogonal partition of $z_{ij}$ by the AMMI model) and $z_{ij}$, respectively; and $\boldsymbol{\gamma}_k$ is the $k$th eigenvector of $\mathbf{Z}'\mathbf{Z}$. Equation [5] is equivalent to the $k$th component of the singular value decomposition method for computing estimates of the AMMI parameters because $\mathbf{Z} \boldsymbol{\gamma}_k = \lambda_k \boldsymbol{\alpha}_k$ (Gabriel, 1978).
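The equivalence between the eigenvector form and the singular value decomposition can be checked numerically. The sketch below uses an arbitrary random, double-centered table (an assumption made only for illustration) and compares the projection of the table onto the kth eigenvector of Z'Z with the kth SVD term.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(6, 4))                 # arbitrary 6 x 4 table
Z -= Z.mean(axis=0, keepdims=True)          # remove column (environment) means
Z -= Z.mean(axis=1, keepdims=True)          # remove row (genotype) means

# Eigen-decomposition of Z'Z, sorted by decreasing eigenvalue
evals, evecs = np.linalg.eigh(Z.T @ Z)
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Singular value decomposition of Z
U, lam, Vt = np.linalg.svd(Z, full_matrices=False)

k = 0
gamma_k = evecs[:, [k]]                          # kth eigenvector as a column
Z_k_eig = Z @ gamma_k @ gamma_k.T                # projection onto the kth eigenvector
Z_k_svd = lam[k] * np.outer(U[:, k], Vt[k])      # kth multiplicative term
```

The two matrices agree up to rounding, and `evals[k]` equals `lam[k] ** 2`; the sign ambiguity of eigenvectors cancels in both products.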

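Sum of squares (SS). The SS for Eq. [4] is

$$\sum_{i}\sum_{j} z_{ijk}^2 = \sum_{i}\sum_{j} \left( \gamma_{jk} \sum_{v} \gamma_{vk} z_{iv} \right)^2 = \boldsymbol{\gamma}_k' \mathbf{Z}'\mathbf{Z} \boldsymbol{\gamma}_k$$

and from this expression, according to the PC theory and considering $\mathbf{Z} = \mathbf{B} + \mathbf{D}$ (Eq. [3]), we can obtain

$$\lambda^2_k = \boldsymbol{\gamma}_k' \mathbf{S} \boldsymbol{\gamma}_k = \boldsymbol{\gamma}_k' \mathbf{S}_{\beta,\beta} \boldsymbol{\gamma}_k + \boldsymbol{\gamma}_k' \mathbf{S}_{\delta,\delta} \boldsymbol{\gamma}_k + 2 \boldsymbol{\gamma}_k' \mathbf{S}_{\beta,\delta} \boldsymbol{\gamma}_k$$
[6]
where $\lambda^2_k$ is the SS of the $z_{ijk}$ elements (i.e., the $k$th eigenvalue of the AMMI model); $\mathbf{S}$, $\mathbf{S}_{\beta,\beta}$, and $\mathbf{S}_{\delta,\delta}$ are the sum of squares and cross products matrices of interaction effects $z_{ij}$, structural GEI effects $\beta_{ij}$, and error effects $\delta_{ij}$, respectively; $\boldsymbol{\gamma}_k$ is the $k$th eigenvector of matrix $\mathbf{S}$; $\mathbf{S}_{\beta,\delta} = (1/2)(\mathbf{B}'\mathbf{D} + \mathbf{D}'\mathbf{B})$ is the covariance of the $\beta_{ij}$ effects with the $\delta_{ij}$ effects. Since $\beta_{ij}$ and $\delta_{ij}$ are independent, $\mathbf{S}_{\beta,\delta}$ is expected to be a zero matrix, but for a particular sample of observations it may not be the zero matrix. The $\mathbf{S}$ matrix is computed as $\mathbf{Z}'\mathbf{Z}$, where $\mathbf{Z}$ is the $(m \times n)$ matrix with elements $z_{ij}$. If the smaller of $(m, n)$ were $m$ instead of $n$, $\mathbf{S}$ would be computed as $\mathbf{Z}\mathbf{Z}'$.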

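Since no reliable way to compute $\mathbf{S}_{\beta,\delta}$ is available, $\lambda^2_k$ can be partitioned into two components as

$$\lambda^2_k = G^2_k + \hat{E}^2_k$$

where $G^2_k$ and $\hat{E}^2_k$ are the structural GEI and error component estimates of $\lambda^2_k$, respectively. Since $\boldsymbol{\gamma}_k$ is estimated from the $\mathbf{Z}'\mathbf{Z}$ matrix, which includes both $\beta_{ij}$ and $\delta_{ij}$ effects, $\boldsymbol{\gamma}_k$ also depends on both structural interaction and error effects. Thus, the error component $\hat{E}^2_k$ cannot be directly estimated from Eq. [6] as $\boldsymbol{\gamma}_k' \mathbf{S}_{\delta,\delta} \boldsymbol{\gamma}_k$. Instead, we obtain an estimate by the following data resampling scheme. Specifically, $\hat{E}^2_k$ will be obtained as

$$\hat{E}^2_k = \boldsymbol{\gamma}_{kS_{\delta,\delta}}' \mathbf{S}_{\delta,\delta} \boldsymbol{\gamma}_{kS_{\delta,\delta}}$$

where $\boldsymbol{\gamma}_{kS_{\delta,\delta}}$ is the $k$th eigenvector of an error matrix $\mathbf{S}_{\delta,\delta}$ that we will compute as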

$$\mathbf{S}_{\delta,\delta} = \frac{1}{r(r - 1)} \sum_{t=1}^{r} (\mathbf{Z}_t - \bar{\mathbf{Z}})'(\mathbf{Z}_t - \bar{\mathbf{Z}})$$
[7]
where $\mathbf{Z}_t$ and $\bar{\mathbf{Z}}$ are the matrix of the GEI effects for repetition $t$ and the average matrix across all repetitions, respectively; and $r$ is the number of repetitions. A prior adjustment of the observations for block effects at each site should be made if trials are laid out in randomized complete block designs. $\mathbf{Z}_t = \mathbf{Y}_t - \mathbf{A}$, where $\mathbf{Y}_t$ and $\mathbf{A}$ are $m \times n$ matrices of replicate-adjusted $y_{ijt}$ values and $a_{ij}$ estimates, respectively. The $\hat{E}^2_k$ estimator is a function of squares and products of $\delta_{ij}$ effects. The eigenvalue estimate of the last PC axis from matrix $\mathbf{S}_{\delta,\delta}$ may not be zero, because the rank of matrix $\mathbf{Z}_t$ in Eq. [7] is $n$ and not $n - 1$, where $n$ is assumed to be the smaller of $(m, n)$. Since the parameter matrix $\mathbf{S}_{\delta,\delta}$ has rank $n - 1$, its $n$th PC eigenvalue is zero. Thus, the last PC eigenvalue $\hat{E}^2_n$ of the estimated matrix $\mathbf{S}_{\delta,\delta}$ should be set to zero, regardless of its estimated value. In turn, the estimated $\hat{E}^2_n$ value is assumed to be absorbed by the remaining $n - 1$ PC axes proportionally to each $\hat{E}^2_k$ estimate to keep the sum of the $\mathbf{S}_{\delta,\delta}$ eigenvalues invariant, as follows:

$$\hat{E}^2_{ak} = \hat{E}^2_k \left( 1 + \frac{\hat{E}^2_n}{\sum_{k'=1}^{n-1} \hat{E}^2_{k'}} \right), \qquad k = 1, \ldots, n - 1$$

where $\hat{E}^2_{ak}$ is the adjusted $\hat{E}^2_k$ estimate.
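The resampling step and the redistribution of the last eigenvalue can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function name `error_eigenvalues` is hypothetical, the replicate-wise interaction tables are assumed to be already adjusted for block effects and stacked in an array of shape (r, m, n) with n no larger than m, and the scaling 1/(r(r - 1)) (the sampling covariance of a mean over r replicates) is an assumption, since the scaling of Eq. [7] is not visible in the text.

```python
import numpy as np

def error_eigenvalues(Zt):
    """Error eigenvalues from replicate-wise interaction tables.

    Zt has shape (r, m, n): r block-adjusted tables Z_t = Y_t - A.
    Returns the raw eigenvalues and the adjusted ones, whose last
    element is set to zero with the total kept invariant.
    """
    Zt = np.asarray(Zt, dtype=float)
    r = Zt.shape[0]
    dev = Zt - Zt.mean(axis=0)                    # Z_t minus the average table
    # Sum over replicates of (Z_t - Zbar)'(Z_t - Zbar); assumed scaling
    S_dd = np.einsum('tij,tik->jk', dev, dev) / (r * (r - 1))
    E2 = np.sort(np.linalg.eigvalsh(S_dd))[::-1]  # decreasing order
    E2 = np.clip(E2, 0.0, None)                   # guard against rounding
    E2a = E2.copy()
    # Spread the last eigenvalue over the others in proportion to their size
    E2a[:-1] = E2[:-1] * E2.sum() / E2[:-1].sum()
    E2a[-1] = 0.0
    return E2, E2a
```

The proportional redistribution leaves the sum of the eigenvalues unchanged, which is the invariance the text asks for.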

Now, $G^2_k$ can be estimated as

$$G^2_k = \lambda^2_k - \hat{E}^2_{ak}$$
[8]

$G^2_k$ mainly includes the structural GEI effects $\beta_{ij}$ but may also include other covariance effects due to sampling, such as products of $\beta_{ij}$ and $\delta_{ij}$ effects. The restriction $G^2_k \ge 0$ for each $k$ was imposed on the model.

Adjustment of $G^2_k$ and $\hat{E}^2_{ak}$ estimates. If one or several $G^2_k$ are negative because of sampling when analyzing a particular data set, the negative $G^2_k$ are set to zero. The remaining positive $G^2_k$ estimates then have a total upward bias equal to the sum of the negative $G^2_k$ with opposite sign. Better estimates of the positive $G^2_k$ can be found by reducing the total bias proportionally to the magnitude of each $G^2_k$, as follows:

$$G^2_{ak} = (1 + c)\, G^2_k, \qquad c = \frac{\sum_{G^2_k < 0} G^2_k}{\sum_{G^2_k > 0} G^2_k} \le 0$$

where $G^2_{ak}$ is the adjusted $G^2_k$ estimate and $c$ is the quotient between the sum of all negative $G^2_k$ estimates and the sum of all positive $G^2_k$ estimates. If the sum of the $G^2_k$ over all $k$ is not positive, then each $G^2_{ak} = 0$.

Now the $\hat{E}^2_{ak}$ in Eq. [8] can also be adjusted again to keep each $\lambda^2_k$ invariant, such that $\hat{E}^2_{aak} = \lambda^2_k - G^2_{ak}$, where $\hat{E}^2_{aak}$ is the second adjusted $\hat{E}^2_k$ estimate.
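The two adjustments just described amount to a few lines of arithmetic. The sketch below is an illustration only; the function name `partition_eigenvalues` is hypothetical, and c is taken literally as the (non-positive) ratio of the sum of negative structural estimates to the sum of positive ones.

```python
import numpy as np

def partition_eigenvalues(lam2, E2a):
    """Split each eigenvalue into structural and error parts.

    lam2: eigenvalues of the AMMI interaction table.
    E2a:  first-adjusted error eigenvalue estimates.
    Returns the adjusted structural parts and the second-adjusted
    error parts, which add up to lam2 axis by axis.
    """
    lam2 = np.asarray(lam2, dtype=float)
    E2a = np.asarray(E2a, dtype=float)
    G2 = lam2 - E2a                      # raw structural estimates (Eq. [8])
    pos = np.clip(G2, 0.0, None)         # negative estimates set to zero
    neg_sum = G2[G2 < 0].sum()           # non-positive
    pos_sum = pos.sum()
    if pos_sum + neg_sum <= 0:           # no net structural signal
        G2a = np.zeros_like(G2)
    else:
        c = neg_sum / pos_sum            # c <= 0
        G2a = pos * (1.0 + c)            # shrink positives proportionally
    E2aa = lam2 - G2a                    # keeps each eigenvalue invariant
    return G2a, E2aa
```

For example, eigenvalues (10, 5, 2, 1) with error estimates (3, 2, 2.5, 1.5) give raw structural parts (7, 3, -0.5, -0.5), so c = -0.1 and the adjusted structural parts are (6.3, 2.7, 0, 0).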

Coefficients of the Variance Components of the Eigenvalues
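For each PC axis, the EVP method divides the sum of squares into

$$\lambda^2_k = g_k\,(m - 1)(n - 1)\,\sigma^2_{GE} + \hat{e}_k\,(m - 1)(n - 1)\,\frac{\sigma^2_e}{r}$$

where $g_k$ and $\hat{e}_k$ are the estimated portions of $\lambda^2_k$ attributable to the structural and error variance components, respectively.

Thus, the SS components of the AMMIp model for the first $p$ PC axes are

$$\sum_{k=1}^{p} G^2_{ak} = g_{cp}\,(m - 1)(n - 1)\,\sigma^2_{GE}$$
[9]
for the structural GEI component, and

$$\sum_{k=1}^{p} \hat{E}^2_{aak} = \hat{e}_{cp}\,(m - 1)(n - 1)\,\frac{\sigma^2_e}{r}$$
[10]
for the error component, where

$$g_{cp} = \sum_{k=1}^{p} g_k, \qquad \hat{e}_{cp} = \sum_{k=1}^{p} \hat{e}_k$$

i.e., $g_{cp}$ and $\hat{e}_{cp}$ are the cumulative coefficients of the AMMIp model.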

Prediction of the RMSPDEVP Criterion
The RMSPD criterion based on cross validation of split data has been frequently used to choose and validate the best AMMI model (Gauch and Zobel, 1988, 1989; Crossa et al., 1990). We propose another criterion that does not require splitting of data. The criterion is the prediction of RMSPD based on the EVP approach (RMSPDEVP).

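$$\mathrm{RMSPDEVP} = \left\{ \hat{E}\left[ (a_{ij} + z^*_{ijp} - y_{ijv})^2 \right] \right\}^{1/2} = \left[ \frac{(m + n - 1)\,\sigma^2_e}{mnr} + \frac{(m - 1)(n - 1)\,\hat{e}_{cp}\,\sigma^2_e}{mnr} + \frac{(m - 1)(n - 1)(1 - g_{cp})\,\sigma^2_{GE}}{mn} + \sigma^2_e \right]^{1/2}$$
[11]
where $\hat{E}$ refers to the estimated expectation; $a_{ij}$, $z^*_{ijp}$, and $y_{ijv}$ are the additive and GEI effects of the AMMIp model and any independent validation observation for the $ij$ cell, respectively; terms $(m + n - 1)\sigma^2_e/mnr$, $(m - 1)(n - 1)\hat{e}_{cp}\sigma^2_e/mnr$, and $\sigma^2_e$ are the empirical error components associated with $a_{ij}$, $z^*_{ijp}$, and $y_{ijv}$, respectively; $(m - 1)(n - 1)(1 - g_{cp})\sigma^2_{GE}/mn$ is the empirical variance component associated with the difference between the full true GEI effects and those predicted by AMMIp. The best truncated AMMI model will be the one that shows the smallest RMSPDEVP value.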
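As a rough illustration of how the four terms combine, the sketch below evaluates the criterion from the cumulative coefficients and the two variance components. The function name `rmspd_evp` and its argument names are hypothetical. One assumption is made explicit: the retained-error term is multiplied by (m - 1)(n - 1), which makes the saturated model reduce to the variance of a cell mean plus that of a validation observation.

```python
import math

def rmspd_evp(m, n, r, g_cp, e_cp, var_ge, var_e):
    """Predicted RMSPD for an AMMI model retaining the first p PC axes.

    g_cp, e_cp: cumulative structural and error coefficients (0 to 1).
    var_ge, var_e: GEI and error variance components.
    """
    df = (m - 1) * (n - 1)
    additive = (m + n - 1) * var_e / (m * n * r)      # error in a_ij
    retained_error = df * e_cp * var_e / (m * n * r)  # error kept by p axes (assumed df factor)
    discarded_gei = df * (1.0 - g_cp) * var_ge / (m * n)  # GEI left out
    validation = var_e                                # error of a new plot
    return math.sqrt(additive + retained_error + discarded_gei + validation)
```

Evaluating the function for p = 0, 1, 2, ... with the corresponding cumulative coefficients and keeping the p with the smallest value mirrors the selection rule stated above. With `g_cp = e_cp = 1` the result is the square root of `var_e / r + var_e`.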

Experimental Data
Partition of SS eigenvalues (Eq. [9] and [10]) was used to analyze three multienvironment trials. Trial 1 was a multienvironment trial coordinated by the Spanish National Seed Institute and included 16 triticale (x Triticosecale Wittmack) cultivars arranged in a randomized complete block design with four replications evaluated at 10 environments in Spain during 1989. The data were used by Royo et al. (1993) for an AMMI analysis. Trial 2 was a CIMMYT maize (Zea mays L.) trial with eight genotypes arranged in a randomized complete block design with four replications at each of 33 international sites during 1987. Trial 3 comprised 11 broad bean (Vicia faba L.) genotypes arranged in a randomized complete block design with three replications grown in 10 environments in southern Spain. Data of Trial 3 were extracted from Cubero and Flores (1995). Data from these three trials were used by Moreno-González and Crossa (1998) for studying the GEAR models.

Simulation Data
To determine the validity of the EVP approach on a wide range of error effects, simulation data sets were generated from the original empirical data by adding to each observed cell mean a random error component for each plot in each trial. The random error effects of simulation data sets came from normal distributions with mean zero and arbitrary standard deviations of 2500, 1000, 600, 400, 240, and 100 kg ha$^{-1}$. Fifty simulation data sets were generated for each arbitrary standard deviation value in Trials 1 and 2, and 67 simulation data sets were generated for each situation in Trial 3.
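The generation of one simulated data set can be sketched as follows. This is a minimal illustration; the function name `simulate_plots` and the (m, n, r) array layout are hypothetical choices, not taken from the paper.

```python
import numpy as np

def simulate_plots(cell_means, r, sd, rng):
    """Add independent normal plot errors to each observed cell mean.

    cell_means: m x n table of observed cell means.
    r: number of replicates (plots) per cell.
    sd: standard deviation of the simulated plot error.
    rng: a numpy.random.Generator instance.
    Returns an array of shape (m, n, r) of simulated plot values.
    """
    cell_means = np.asarray(cell_means, dtype=float)
    m, n = cell_means.shape
    errors = rng.normal(loc=0.0, scale=sd, size=(m, n, r))
    return cell_means[:, :, None] + errors
```

Calling the function 50 times with a fixed `sd` and independent random draws would correspond to the 50 simulated data sets per standard deviation described for Trials 1 and 2.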

Random Data Splitting
Simulation and empirical data were first adjusted to remove the block effects at each site (Cornelius and Crossa, 1999). For all trials, data were randomly split into model data and validation data, the latter including one replication ($r_v = 1$). The model data included the remaining replications; i.e., three for Trials 1 and 2 ($r_m = 3$) and two for Trial 3 ($r_m = 2$). Four random splitting events were performed on each of the 50 generated simulation data sets in Trials 1 and 2, in such a way that the four replicates of each cell were involved in the four validation data sets as follows. Random numbers 1 to 4 were assigned to the replicates of each cell. Each validation data set was formed with the cell replicates bearing the same assigned number. Likewise, three random splitting events were done on each of the 67 generated simulation data sets in Trial 3, in such a way that the three replicates of each cell were involved in the three validation data sets, by a procedure similar to that described for Trials 1 and 2. For empirical data sets, random splittings similar to those described before were also used, but assignment of random numbers to replicates was done 50 times in Trials 1 and 2, and 67 times in Trial 3. In total, 200 model data sets were validated for Trials 1 and 2, and 201 for Trial 3.
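The splitting scheme, in which each replicate of a cell serves exactly once as validation data, can be sketched as below. The function name `split_replicates` and the (m, n, r) layout of block-adjusted plot values are assumptions made for illustration.

```python
import numpy as np

def split_replicates(Y, rng):
    """Build r model/validation splits from block-adjusted plot data.

    Y has shape (m, n, r). Within each cell, the labels 0..r-1 are
    assigned to the replicates at random; split v validates on the
    replicate labelled v and models on the mean of the other r - 1.
    Returns a list of (model_means, validation) pairs of m x n tables.
    """
    Y = np.asarray(Y, dtype=float)
    m, n, r = Y.shape
    labels = np.tile(np.arange(r), (m, n, 1))
    labels = rng.permuted(labels, axis=2)     # independent shuffle per cell
    splits = []
    for v in range(r):
        is_val = labels == v                  # exactly one True per cell
        validation = Y[is_val].reshape(m, n)
        model_means = Y[~is_val].reshape(m, n, r - 1).mean(axis=2)
        splits.append((model_means, validation))
    return splits
```

Repeating the call on each of 50 data sets with r = 4 yields the 200 model data sets mentioned for Trials 1 and 2.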

Comparison of the RMSPDEVP and RMSPDCV Criteria
The AMMI model was applied to the cell means of all model data. The conventional RMSPD based on cross validation of split data (RMSPDCV) between the predicted AMMI and the validation cell means was averaged across all cells and split data sets (Gauch, 1988; Gauch and Zobel, 1988, 1989; Crossa et al., 1990) for each study case. The RMSPD based on the EVP theory (RMSPDEVP) was also estimated (Eq. [11]) for the same AMMIp modeling data subset. Parameters of Eq. [11] were estimated from the modeling data. Comparison of RMSPDEVP and RMSPDCV was used to measure reliability of RMSPDEVP as a criterion for choosing the best truncated AMMIp model.


    RESULTS
 
Eigenvalue Partition Method and the Different Cases
The GEI variance in the truncated AMMIp model has two components: (i) the structural component estimated as $g_{cp}\sigma^2_{GE}$ and (ii) the error variance component estimated as $\hat{e}_{cp}\sigma^2_e/r$. In the empirical data, estimates of $\sigma^2_{GE}$ were 410824, 138921, and 104923, and those of $\sigma^2_e$ were 169135, 422913, and 124568 for Trials 1, 2, and 3, respectively (Table 1). In the simulation data, the initial $\sigma^2_{GE}$ parameters before adding errors to the cell means were 453107, 244650, and 146425 for Trials 1, 2, and 3, respectively (Table 1).


Table 1. Initial parameters of structural genotype x environment interaction (GEI) cumulative coefficients ($g_{cp}$) of the principal component (PC) axes, and structural GEI variance component ($\sigma^2_{GE}$) for the simulation data from Trials 1, 2, and 3; and estimates of the structural GEI variance component ($\sigma^2_{GE}$), the error variance component ($\sigma^2_e$), and the mean for empirical data in Trials 1, 2, and 3.

 
Coefficients $g_k$ and $\hat{e}_k$ estimate the portion of the total structural GEI and error SS explained by the $k$th PC axis in the AMMI models, respectively. They can also be regarded as the portion of the GEI and error variance components estimated in the conventional ANOVA. Tables 2 through 4 show the estimates of the cumulative coefficients, $g_{cp}$ and $\hat{e}_{cp}$, for the different cases studied in Trials 1 to 3. Single random $\mathbf{S}$ and $\mathbf{S}_{\delta,\delta}$ matrix estimates were used for computing the variance component coefficients $g_{cp}$ and $\hat{e}_{cp}$ of each individual model data set. Results of the EVP method showed that the cumulative coefficients estimating the portion of the structural GEI variance component were larger than the coefficients estimating the portion of the error variance component for all cases, regardless of the type of trial, the error magnitude, and the number of replicates.


Table 2. Estimates of cumulative coefficients of the structural genotype x environment interaction ($g_{cp}$) and error ($\hat{e}_{cp}$) variance components, and root mean square predictive difference, based on the EVP theory (RMSPDEVP) and cross validation procedure (RMSPDCV), for the first p principal component (PC) axes of the additive main effects and multiplicative interaction (AMMI) model for different simulation data generated from the triticale trial and the triticale empirical data of Trial 1, averaged over 200 data sets.

Table 3. Estimates of cumulative coefficients of the structural genotype x environment interaction ($g_{cp}$) and error ($\hat{e}_{cp}$) variance components, and root mean square predictive difference, based on the EVP theory (RMSPDEVP) and cross validation procedure (RMSPDCV), for the first p principal component (PC) axes of the additive main effects and multiplicative interaction (AMMI) model for different simulation data generated from the maize trial and the maize empirical data of Trial 2, averaged over 200 data sets.

Table 4. Estimates of cumulative coefficients of the structural genotype x environment interaction ($g_{cp}$) and error ($\hat{e}_{cp}$) variance components, and root mean square predictive difference, based on the EVP theory (RMSPDEVP) and cross validation procedure (RMSPDCV), for the first p principal component (PC) axes of the additive main effects and multiplicative interaction (AMMI) model for different simulation data generated from the faba bean trial and the faba bean empirical data of Trial 3, averaged over 201 data sets.

 
Results from simulation data showed that the initial $g_{cp}$ coefficient parameters, before adding errors to the cell means (Table 1), were close to the $g_{cp}$ estimates when the standard deviation was 100 in all trials (Tables 2–4). As the standard error involved in the predicted cell means increased, the estimates of the structural GEI cumulative coefficients of the first PC axes decreased in Trials 1 and 3. Means of the cumulative coefficient estimates of the first two PC axes, $g_{c2}$, were 0.877, 0.868, 0.821, and 0.716 in Trial 1 (Table 2) and 0.788, 0.765, 0.666, and 0.591 in Trial 3 (Table 4) for model simulation data with standard errors 100, 400, 1000, and 2500, respectively. However, the cumulative $g_{c2}$ coefficient increased from 0.597 to 0.653 as the standard error increased from 1000 to 2500 in the maize simulation data for Trial 2 (Table 3). The results from Trials 1 and 3 thus appear to disagree with those of Trial 2. Simulation data were generated from the empirical trials by adding a random error to the cell means. It seems logical to think that as greater noise (error) is added to the cell means, the association structure among observed cell means becomes weaker. Thus, the portion of variation explained by the first PC axes becomes smaller. This may explain the decrease of the structural GEI cumulative $g_{c2}$ coefficients as error increases in Trials 1 and 3 (Tables 2 and 4). A different explanation is suggested for Trial 2. Initial GEI cumulative coefficients of original cell means before adding the simulation error effect were much smaller in Trial 2 than in Trials 1 and 3 (Table 1). Thus, there was little room for a further decrease of the first $g_{cp}$ as error increased in Trial 2. The rise in the first $g_{cp}$ estimates when the standard error increased from 1000 to 2500 may be attributed to a high frequency of negative $G^2_k$ as error increased.

Means of the error cumulative coefficient estimates $\hat{e}_{cp}$ increased as the simulation standard error increased from 100 to 2500 in all trials (Tables 2–4), when the second adjusted $\hat{E}^2_{aak}$ estimates were considered. For example, $\hat{e}_{c2}$ estimated means were 0.390, 0.403, 0.432, and 0.459 in Trial 1; 0.401, 0.402, 0.404, and 0.424 in Trial 2; and 0.314, 0.536, 0.555, and 0.550 in Trial 3, for standard errors 100, 400, 1000, and 2500, respectively. However, when the first adjusted $\hat{E}^2_{ak}$ estimates were considered, the $\hat{e}_{cp}$ means remained approximately constant for each PC axis in each trial as the standard error increased from 100 to 2500 (data not shown). Furthermore, each simulation trial had a characteristic distribution pattern of the $\hat{e}_{cp}$ coefficients (i.e., the $\hat{E}^2_{ak}$) along the PC axes. This characteristic distribution of $\hat{E}^2_{ak}$ does not depend on the magnitude of the overall error; rather, it might be attributed to the number of replications, genotypes, and environments in the trial.

Results from full data and modelling subset data in empirical trials were similar to those discussed above for the simulation data. The structural GEI cumulative coefficients of the first PC axes, $g_{cp}$, were larger for full empirical trials than for model empirical data, likely because a smaller error is associated with the cell means of full trials than with those of model data. For example, $g_{c2}$ estimates were 0.926, 0.751, and 0.886 for full data, whereas they were 0.921, 0.724, and 0.867 for model data in Trials 1, 2, and 3, respectively (Tables 2–4). Conversely, the proportionate contribution of error variance to the first PC axes was smaller for full empirical trials than for model empirical data, likely because a higher number of replicates was involved in the full trials. For example, estimates of the cumulative $\hat{e}_{c2}$ for the first two PC axes were 0.421, 0.401, and 0.546 for full data, whereas they were 0.450, 0.423, and 0.594 for model data in Trials 1, 2, and 3, respectively (Tables 2–4).

Comparisons of RMSPDEVP and RMSPDCV Criteria
Both the RMSPDEVP and RMSPDCV estimates were computed on 200 single model data sets for each simulation or empirical situation in Trials 1 and 2, and on 201 in Trial 3. The EVP method was applied to model data subsets involving three replications for Trials 1 and 2, and two replications for Trial 3. Estimates of the $g_{cp}$ and $\hat{e}_{cp}$ coefficients were further used to compute the RMSPDEVP predictions (Eq. [11]). The RMSPDEVP was obtained from the same model data subsets as used for RMSPDCV, but the RMSPDCV criterion requires one additional replication as validation data for empirically estimating the root mean square differences between predicted AMMI and observed cell means (Gauch, 1988). RMSPDEVP predictions were similar to the RMSPDCV estimates for all PC axes, simulation standard errors, replications, and trials studied (Tables 2–4). The two approaches were coincident when all PC axes (saturated model) or no PC axis (AMMI0; the additive model) were involved in the RMSPD estimates. The RMSPDEVP predictions were slightly smaller than the RMSPDCV estimates for intermediate PC axes, especially when large standard errors were involved.

In this study, the smallest validation RMSPD value has been used to select the number of PC axes involved in the best AMMI model (Gauch and Zobel, 1988, 1989; Crossa et al., 1990). The relative ranking of the RMSPDEVP predictions with varying number of PC axes retained was similar to the ranking of the validation RMSPDCV estimates along the PC axes for most of the cases and trials. Furthermore, the number of selected PC axes with the smallest RMSPD estimate either coincided or was very close in both the EVP method and the AMMI cross validation approach for most of the studied cases. Some exceptions were found. For example, in the model data of the empirical triticale Trial 1 with three replicates (Table 2), the EVP method selected the first five PC axes, while the conventional AMMI approach selected the first two PC axes, AMMI5 being the next best model. RMSPDCV estimates of AMMI2 and AMMI5 were very close, 471.8 and 473.3, respectively. In this example, RMSPDCV estimates from AMMI0 to AMMI9 followed a saw tooth pattern; they first decreased until the second PC, then increased up to the third PC, went down again to the fifth PC, increased again up to the eighth PC, and finally went down to the ninth PC. In contrast, the relationship of RMSPDEVP to the number of terms retained displayed a smoother pattern, in all cases monotonically decreasing to a minimum and then monotonically increasing. Another discrepant case between the EVP and the AMMI methods was found in the simulation data of Trial 3 for standard deviation = 400 (Table 4), where the EVP method selected the first five PC axes, whereas conventional AMMI selected only the first PC axis. In this case, the RMSPDCV estimates also followed a saw tooth pattern, where the second best model was AMMI5, which had been selected by the EVP method.

The RMSPDEVP Criterion
In most cases, the RMSPDEVP criterion selected a number of PC axes similar to that selected by the cross validation method proposed by Gauch and Zobel (1988). Even for the discrepant cases, the RMSPDEVP seemed to be realistic, not showing a saw tooth pattern. Another advantage of the RMSPDEVP criterion over the cross validation RMSPDCV is that the latter cannot be performed for the full data because at least one replication must be withheld for validation. Thus, the RMSPDEVP criterion can be adopted for mean prediction of the AMMI model because it can be used for the full data.

As the error associated with the cell means decreased, and the number of replications involved in the estimates increased, the RMSPD decreased, indicating better precision of predicted means. The use of all replicates is the best option for mean prediction because the error associated with cell means decreases. The number of PC axes selected for the best AMMI model depends on the error associated with cell means and the number of replicates involved in the model data.
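The dependence on replication can be made explicit with a short worked relation. As a simplifying assumption (not stated in this form in the text), suppose the plot errors within a cell are independent with common variance $\sigma_e^2$, and let $r$ be the number of replicates averaged into each cell mean $\bar{y}_{ij\cdot}$:

```latex
\operatorname{Var}(\bar{y}_{ij\cdot}) = \frac{\sigma_e^{2}}{r},
\qquad
\frac{\operatorname{Var}(\bar{y}_{ij\cdot} \mid r = 4)}{\operatorname{Var}(\bar{y}_{ij\cdot} \mid r = 3)}
  = \frac{\sigma_e^{2}/4}{\sigma_e^{2}/3} = \frac{3}{4}.
```

Under that assumption, moving from three to four replicates reduces the error variance of each cell mean by one quarter, which is consistent with the lower RMSPD observed when all replicates are used.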

In the simulation data, both methods selected the saturated model (i.e., the cell means) in all trials for the small standard errors of 100 and 240. In contrast, both methods selected the AMMI0 (i.e., the additive model) in all trials for the largest standard error of 2500. Different numbers of PC axes were selected in the trials for the intermediate standard errors of 400, 600, and 1000 (Tables 2-4).

For the data of Trials 1 and 3, the EVP method selected the same number of PC axes when the entire data set was used as when one fewer replicate was used. However, the EVP method selected three PC axes in the entire data of Trial 2 with four replicates and only two PC axes in the model data including three replicates. Thus, in Trial 2, the number of selected PC axes increased as the error decreased and the number of replicates increased.


    DISCUSSION
The conventional random data splitting and cross validation procedure cannot be trusted to identify the best truncated model obtainable from the entire data set because at least one replication must be left for validation. Another problem that the conventional cross validation procedure does not consider is the noise (error) in the modelling and validation data that occurs as a consequence of ignoring block (replicate) differences within environments. Cornelius and Crossa (1999) suggested that if cross validation is used, data should be adjusted for replicate differences within environments. The EVP method along with the RMSPDEVP criterion for selecting the best truncated AMMI model allows the use of all trial replicates and adjustment for block effects.
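One simple way to adjust observations for replicate differences within environments is sketched below in Python. It assumes a complete randomized complete block design in every environment, stored as a genotype x environment x replicate array, and removes each block's deviation from its environment mean. The function name `adjust_for_blocks` is hypothetical, and this is an illustration of the general idea rather than the exact adjustment used by the authors.

```python
import numpy as np

def adjust_for_blocks(data):
    """Remove block (replicate) effects within each environment.

    data: array of shape (genotypes, environments, replicates), with every
    genotype present in every block (complete RCBD, an assumption here).
    """
    block_means = data.mean(axis=0, keepdims=True)       # mean of each block, shape (1, e, r)
    env_means = data.mean(axis=(0, 2), keepdims=True)    # mean of each environment, shape (1, e, 1)
    # Subtract the deviation of each block from its environment mean.
    return data - (block_means - env_means)
```

Because the block deviations sum to zero within each environment, this adjustment leaves the genotype x environment cell means unchanged while removing block-to-block differences from the individual observations.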

Matrices Zt (Eq. [7]) were formed from observations of replicate t after adjustment for block effect and after taking out the main effects of the cell means. Following this adjustment, any random cell replicate could be used to form matrix Zt. To study the effect of random choice of cell replicates on selection of the best AMMI model when using the RMSPDEVP criterion, 200 single data sets involving random Zt were individually analyzed for full data of each empirical trial. The best AMMI model was selected in 173, 199, and 184 out of 200 single events performed in Trials 1, 2, and 3, respectively (data not shown). Differences between RMSPDEVP estimates from misclassified and correct choices were very small. Furthermore, the misclassified RMSPDEVP estimates always selected an AMMI model closest to the best one. It appears that little error would be incurred if RMSPDEVP estimates from a single data set were used to select the best AMMI model. However, it is suggested to average RMSPDEVP estimates from eight to 10 random data sets to avoid misclassification of the best AMMI model.
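The averaging step suggested above can be sketched in a few lines of Python. The sketch assumes that the RMSPD estimates have already been computed and arranged with one row per random data set and one column per model (column k for AMMIk). The function name `select_axes` and the numbers in the usage example are hypothetical and are not taken from the trials.

```python
import numpy as np

def select_axes(rmspd_by_run):
    """Select the number of PC axes from RMSPD estimates averaged over runs.

    rmspd_by_run: array-like of shape (runs, models); column k holds the
    RMSPD of AMMIk for each random data set.
    Returns the selected number of axes, the averaged RMSPD curve, and the
    fraction of single runs that would have selected the same model.
    """
    curves = np.asarray(rmspd_by_run, dtype=float)
    mean_curve = curves.mean(axis=0)            # average RMSPD per model
    best = int(np.argmin(mean_curve))           # model with smallest average RMSPD
    per_run_choice = curves.argmin(axis=1)      # model each single run would pick
    agreement = float(np.mean(per_run_choice == best))
    return best, mean_curve, agreement

# Hypothetical RMSPD values for AMMI0 to AMMI2 from three random data sets.
example = [[5.0, 3.0, 4.0],
           [5.0, 4.0, 3.9],
           [5.0, 3.0, 4.0]]
best, mean_curve, agreement = select_axes(example)
```

In this made-up example the second run on its own would pick a neighbouring model, while the averaged curve selects AMMI1. The returned agreement fraction plays the same role as the counts of 173, 199, and 184 out of 200 reported above.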

In general, results from the EVP method and cross validation agreed with Cornelius and Crossa (1999), in the sense that prediction assessments done on model data comprising fewer replicates are noisier, and therefore less precise (i.e., have larger error), than prediction assessments done on model data with more replicates. Furthermore, results of this study suggested that the expected interaction mean squared error for the predictively best truncated AMMI model decreases as error variance decreases and as the number of replicates increases.

Cornelius et al. (1993) and Cornelius and Crossa (1995, 1999) developed shrinkage estimators of multiplicative models that seem to eliminate the need for either cross validation or tests of hypotheses as criteria for determining the number of multiplicative terms to be retained in a multiplicative model. In a companion paper, Moreno-González et al. (2003) use the theory of the EVP method presented in this study to develop shrinkage factors for the AMMI model, which are similar to the shrinkage factors developed by Cornelius et al. (1993) and Cornelius and Crossa (1995, 1999).


    CONCLUSIONS
It is possible to separate the structured GEI from the nonstructured GEI by partitioning the total GEI variance into GEI and error variance components associated with the AMMI predicted means. The EVP method provided unbiased estimates of variance components for the additive main effects and approximate estimates of variance components for the multiplicative interaction effects. The RMSPDEVP criterion seems useful for selecting the best truncated models and has some advantages over the traditional cross validation criterion (RMSPDCV): it can be applied to all trial replications and to RCBD. Application of the EVP method is simple because it does not need data splitting, and the computation tools for matrices S, Sδ,δ, and Zt are available in many statistical packages.


    ACKNOWLEDGMENTS
 
Research was partially funded by INIA, Spain grants SC97-074 and RTA01-140.

Received for publication December 18, 2002.


This article has been cited by other articles:


H. G. Gauch Jr.
Statistical Analysis of Yield Trials by AMMI and GGE
Crop Sci., May 18, 2006; 46(4): 1488 - 1500.


J. Moreno-Gonzalez, J. Crossa, and P. L. Cornelius
Additive Main Effects and Multiplicative Interaction Model: II. Theory on Shrinkage Factors for Predicting Cell Means
Crop Sci., November 1, 2003; 43(6): 1976 - 1982.

