a Alberta Agriculture, Food and Rural Development, Room 300, 7000–113 Street, Edmonton, AB, Canada T6H 5T6, and Dep. of Agricultural, Food and Nutritional Science, Univ. of Alberta, Edmonton, AB, Canada T6G 2P5
b Crop Diversification Centre North, Alberta Agriculture, Food and Rural Development, RR6, 17507 Fort Road, Edmonton, AB, Canada T5B 4K3
c Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico D.F., México
d Crop Diversification Centre South, S.S. #4, Alberta Agriculture, Food and Rural Development, Brooks, AB, Canada T1R 1E6
* Corresponding author (rong-cai.yang{at}ualberta.ca).
| ABSTRACT |
|---|
Abbreviations: AFPRVT, Alberta Field Pea Regional Variety Test; GEI, genotype × environment interaction; UPGMA, unweighted pair-group method using arithmetic averages.
| INTRODUCTION |
|---|
With recent interest in diversification of crops, aiming at enhancing the long-term sustainability of agriculture in western Canada, nontraditional crops such as field pea have been increasingly incorporated into the farming system in the Canadian Prairies. In the Province of Alberta, field pea is the most cultivated nontraditional crop, accounting for about 55% of the total acreage for these crops (Olson et al., 2001). As field pea production has been expanded to all possible growing areas of the Province, demand for new cultivars with high and stable yields is increasing. Since 1987, Alberta Agriculture, Food and Rural Development has coordinated the Alberta Field Pea Regional Variety Test (AFPRVT) Program to conduct multiyear and multisite testing to recommend cultivars to pea producers across the province. These multienvironment data are routinely averaged on a regional (geographic) basis across years (Park and Lopetinsky, 1999). Clearly, this geography-based criterion for cultivar selection does not address the three issues described above, and thus may not be reliable for choosing appropriate cultivars according to site production levels.
In this study, we propose a performance-based approach to grouping test sites for cultivar recommendation. We coin the term isoyield environments to describe those sites that are homogeneous in their yielding ability, but not necessarily contiguous in their geography. The concept of isoyield environments is very similar to that of megaenvironments (Gauch and Zobel, 1997), but with a focus on the site performance in terms of yielding ability. We use this approach to examine patterns of isoyield groups for the field pea trials conducted from 1997 to 2001.
| MATERIALS AND METHODS |
|---|
Let $\bar{y}_{ij.}$ be the average yield of the ith (i = 1, 2, ..., g) field pea cultivar across 3 or 4 replications in the jth (j = 1, 2, ..., e) test site in a given year. We first conducted the baseline analysis that partitions the value of $\bar{y}_{ij.}$ into the effect of the ith cultivar ($\tau_i$), the effect of the jth test site ($\delta_j$), and the interaction between these two effects ($(\tau\delta)_{ij}$) under the classic two-way fixed effects model,

$$\bar{y}_{ij.} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + \bar{\varepsilon}_{ij.} \qquad [1]$$

where $\mu$ is the overall mean and $\bar{\varepsilon}_{ij.}$ is assumed to be normally and independently distributed with mean zero and variance $\sigma^2/n$ (where n is the number of replicates which, in this case, is n = 3 or 4). The GEI effect ($(\tau\delta)_{ij}$) could be further studied by means of different statistical analyses, including stability analysis based on regression models (Finlay and Wilkinson, 1963) or linear–bilinear models (Zobel et al., 1988; Cornelius et al., 1992; Crossa and Cornelius, 1997) and likelihood analysis based on mixed models (Piepho, 1999; Yang, 2002).
For our subsequent cluster analysis, we chose the regression-based stability analysis for deriving dissimilarity indexes between pairs of sites, using a modification of Method 1 of Lin and Butler (1990), with the roles of cultivars and sites being swapped. The dissimilarity index between a pair of sites is the difference between residual sums of squares, after fitting a regression on the cultivar index using the data from both sites and after fitting two separate regressions, one for each site. Adopting the approach of Finlay and Wilkinson (1963), we used the following regression model, examining the stability of the sites rather than the stability of the cultivars:
$$\bar{y}_{ij.} = \mu_j + b_j w_i + d_{ij} + \bar{\varepsilon}_{ij.} \qquad [2]$$

where $\mu_j$ is the mean of the jth site, $b_j$ is the coefficient of linear regression of $\bar{y}_{ij.}$ on the cultivar mean $w_i$, $d_{ij}$ is the deviation from the linear regression (the unexplained portion of interaction), and $\bar{\varepsilon}_{ij.}$ is the mean residual error, as in Eq. [1]. We prefer this regression-based analysis for two reasons. First, the direct connection between the cluster analysis and the regression analysis enabled us to establish an empirical cutoff point from the dendrogram based on the F-test statistic (the ratio of the smallest dissimilarity index to the estimated error mean square), so that the number of isoyield groups could be impartially identified. Second, the estimated site means and slopes (the site × cultivar interaction) for individual sites were valuable in selecting appropriate test sites from the isoyield groups identified by the cluster analysis.

For the hierarchical cluster analysis and dendrogram construction, we computed the dissimilarity index between pairs of sites for each year using the regression model as described in Eq. [2]. Thus, the dissimilarity indexes derived in this manner would be the numerators of the F-test statistics for a common regression between any two sites. Extending this concept to more than two sites, as shown in Lin and Butler (1990), the dissimilarity index between any two clusters (each involving one or more sites) would be the numerator of the F test for similarity of the two clusters so long as the sites were grouped according to Sokal and Michener's (1958) unweighted pair-group method. Using this clustering method, a dissimilarity index between a pair of clusters was calculated as the average of dissimilarity indexes between all pairs of sites within and among clusters.
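The pairwise dissimilarity index described above can be illustrated with a minimal sketch. It assumes the index is the raw difference between the residual sum of squares of one common regression and the summed residual sums of squares of two separate regressions; any scaling by degrees of freedom is not shown. The function names and the three-cultivar data are hypothetical and serve only as an illustration.

```python
def fit_line(x, y):
    """Least-squares line y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def site_dissimilarity(w, y_site1, y_site2):
    """RSS of one common regression minus RSS of two separate regressions.

    w is the cultivar index shared by both sites; y_site1 and y_site2 are
    the cultivar mean yields at each site (same cultivar order as w).
    """
    _, _, rss1 = fit_line(w, y_site1)
    _, _, rss2 = fit_line(w, y_site2)
    _, _, rss_common = fit_line(list(w) + list(w), list(y_site1) + list(y_site2))
    return rss_common - (rss1 + rss2)


# Hypothetical data: three cultivars, two sites with different means and slopes.
w = [-1.0, 0.0, 1.0]
site_a = [1.0, 2.0, 3.0]
site_b = [2.0, 4.0, 6.0]
d_ab = site_dissimilarity(w, site_a, site_b)
```

Two sites with identical regression lines give an index of zero, so larger values indicate sites whose yield responses to the cultivar index differ more.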
These between-cluster dissimilarity indexes were calculated by invoking the SPSS CLUSTER procedure with the METHOD subcommand set to WAVERAGE (SPSS, 2002). However, they should not be confused with those given by the method of average linkage between clusters (groups), commonly known as the unweighted pair-group method using arithmetic averages (UPGMA). A UPGMA-based dissimilarity index would be an average of the dissimilarity indexes between pairs of sites from different clusters only, as calculated in SAS PROC CLUSTER with the METHOD = AVERAGE option (SAS Institute, 1999) or the SPSS CLUSTER procedure with the METHOD subcommand set to BAVERAGE (SPSS, 2002). The denominator of the F tests was the error mean square (MSE) left unaccounted for after fitting regressions for individual sites. Thus, an empirical cutoff point for the dendrogram constructed from the cluster analysis was established based on the F-test statistic (ratio of the smallest dissimilarity index at each cycle of grouping to the estimated MSE). In other words, the cycle at which the calculated F ratio exceeded its critical value would be considered an appropriate cutoff point.
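The cutoff rule can be sketched as follows. This is a simplified illustration, not the procedure as run in SPSS: it assumes the smallest dissimilarity index at each grouping cycle and the matching critical F values are already available (the degrees of freedom behind those critical values are not specified here), and that each cycle merges exactly two clusters. All names and numbers are hypothetical.

```python
def cutoff_cycle(merge_dissimilarities, mse, critical_values):
    """Return the first grouping cycle (1-based) whose F ratio exceeds its
    critical value, or None if no cycle does."""
    for cycle, (d, f_crit) in enumerate(
        zip(merge_dissimilarities, critical_values), start=1
    ):
        if d / mse > f_crit:
            return cycle
    return None


def number_of_groups(n_sites, cycle):
    """Clusters remaining just before the rejected merge at `cycle`."""
    if cycle is None:
        return 1  # every merge was accepted, so all sites form one group
    return n_sites - (cycle - 1)


# Hypothetical run with five sites and therefore four grouping cycles.
dissimilarities = [1.0, 2.0, 9.0, 12.0]
cycle = cutoff_cycle(dissimilarities, mse=2.0, critical_values=[3.0, 3.0, 3.0, 3.0])
groups = number_of_groups(5, cycle)
```

In this made-up run the F ratios are 0.5, 1.0, 4.5, and 6.0, so the third merge is the first one rejected and three groups remain.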
The across-year analysis had a number of difficulties, including highly unbalanced data in year × site × cultivar combinations and considerable differences in site × cultivar means across years. To overcome these difficulties, we normalized the yield data at each site in each year to create the following 10 cultivar classes: $(-\infty, -2s_{ij})$, $(-2s_{ij}, -s_{ij})$, $(-s_{ij}, -0.5s_{ij})$, $(-0.5s_{ij}, -0.2s_{ij})$, $(-0.2s_{ij}, 0s_{ij})$, $(0s_{ij}, 0.2s_{ij})$, $(0.2s_{ij}, 0.5s_{ij})$, $(0.5s_{ij}, s_{ij})$, $(s_{ij}, 2s_{ij})$, and $(2s_{ij}, \infty)$, where $s_{ij}$ is the standard deviation for the ith year and jth site. The use of 10 classes was a compromise between the need to have sufficient data points for the regression analysis and the need to have at least one observation in each class. While the boundary values set for each cultivar class were somewhat arbitrary, the classes would have the expected frequencies of 0.0228, 0.1359, 0.1499, 0.1122, 0.0793, 0.0793, 0.1122, 0.1499, 0.1359, and 0.0228, if the data were distributed according to a normal distribution. Thus, the site × cultivar class means across years and cultivars were calculated for the regression analysis. The site × site matrix of dissimilarity indexes derived from the regression analysis was used in the cluster analyses, just as was done for individual years, to generate the dendrogram. The number of distinct isoyield groups from the dendrogram was determined from the F tests described above.
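The expected class frequencies quoted above follow directly from the standard normal distribution, as this short check shows. The function names are illustrative, and the rule that a value lying exactly on a boundary falls into the upper class is an assumption of this sketch, since the text does not say how ties are handled.

```python
import math

# Class boundaries in units of the site-year standard deviation.
BOUNDS = [-math.inf, -2.0, -1.0, -0.5, -0.2, 0.0, 0.2, 0.5, 1.0, 2.0, math.inf]


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_class_frequencies():
    """Probability mass of each of the 10 classes under normality."""
    cdf = [normal_cdf(b) for b in BOUNDS]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def class_index(yield_value, site_mean, site_sd):
    """0-based class of a yield after normalizing within its site and year."""
    z = (yield_value - site_mean) / site_sd
    for k in range(10):
        if z < BOUNDS[k + 1]:
            return k
    return 9


frequencies = [round(f, 4) for f in expected_class_frequencies()]
```

The rounded values reproduce the ten frequencies listed in the text and sum to one.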
| RESULTS AND DISCUSSION |
|---|
Table 2 presents the combined analyses of variance for individual years and for the averages across years (using the normalization procedure explained earlier). While not entirely comparable, site variation from the analysis based on the across-year averages was much less than that from any individual-year analysis. Likewise, the CV from the across-year analysis was the lowest, compared with those from the individual-year analyses. The effects due to cultivars, sites, and their interaction under the across-year analysis were all significant. In the individual-year analyses, the cultivar and site effects were significant in every year, whereas the site × cultivar interaction was significant from 1999 to 2001, but not in 1997 and 1998. In fact, the F ratios of mean squares for site × cultivar and for pooled error were less than unity (F < 1) in 1997 and 1998, but ranged from 2.45 to 5.72 from 1999 to 2001.
Isoyield Groups
The cluster analysis and subsequent F tests based on dissimilarity indexes calculated for pairs of sites or clusters of sites led to classification of test sites into different numbers of isoyield groups in individual years: six in 1997, 10 in 1998, 2000, and 2001, 12 in 1999, and seven across the 5 yr (Table 3). The dendrogram with a cutoff point (the vertical dashed line) from the across-year analysis is portrayed in Fig. 2. The different letters in each column of Table 3 identified different isoyield groups, and the regression lines of sites within an isoyield group would not be significantly different from one another at the 0.05 probability level according to the F tests. It was evident that the sizes of isoyield groups were smaller in individual years than across years.
1); 1997 was the second best year with an average yield performance, but above-average stability (b
0); 1998 and 2000 were not good environments, as they were either unstable (1998) or had low yield performance (2000). The site pairing within the isoyield groups involving Standard was quite inconsistent across the 4 yr. Standard was paired with Vegreville and Fairview in 1997, with Namao and Bow Island (irrigated) in 1998, with no other sites in 1999, and with Bow Island (dryland) in 2000. Unpredictable year-to-year weather fluctuation typical in the Canadian Prairies may be the possible cause of yield variation and site instability across years. Thus, averaging across years and cultivar classes as we did in the across-year analysis would have filtered out much of the year-to-year variation so that the resultant averaged yields would be close to the true site averages. This is certainly consistent with the result from the across-year analysis, showing only seven isoyield groups of 34 sites compared with 10 to 12 isoyield groups with 22 or fewer sites in individual years, except for 1997 (six isoyield groups with 11 sites only). For those sites with 1-yr data (i.e., Acadia Valley, Three Hills, Paradise Valley, Manning, and St. Isidore), the b values calculated from individual years and across years were somewhat different because the cultivar index used as an independent variable in the regression analysis was calculated from yields of actual cultivars in the individual year, but from average yields of cultivar classes (derived from normalization) across years. Nevertheless, such differences were not appreciably large for all cases involved, suggesting the normalization procedure is probably adequate for combining the data across years.
A question naturally arises whether or not the fit of a linear relationship, as described above, is good. Testing for significance of b values would usually be considered. However, it should be emphasized that the estimates of stability (b values) from the Finlay-Wilkinson regression analysis are data-based indexes for descriptive purposes, not for prediction. For a prediction model, the independent variable must be measured before the experiment, not derived after the experiment as in the Finlay-Wilkinson regression analysis (Lin and Binns, 1994). Thus, the goodness-of-fit of the linear regression would be best judged by how much of the variation could be accounted for by the model (Crossa, 1990; Lin and Butler, 1990). It is suggested that a b value, regardless of its magnitude, should be a useful indicator of response characteristics if the coefficient of determination ($r^2$) is at least 50% (Lin and Butler, 1990). In our present study, the $r^2$ values were 50% or higher in 3 of 11 sites in 1997, 5 of 20 sites in 1998, 0 of 22 sites in 1999, 0 of 20 sites in 2000, and 9 of 21 sites in 2001. Clearly, the linear regression model was generally inadequate in individual years. In contrast, the $r^2$ values were 50% or higher in 25 of 34 sites when combining the data across years, suggesting that the proposed linear response adequately described the variation due to site × cultivar class interaction at most test sites.
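The 50% criterion can be applied per site with a few lines of code. The sketch below assumes a simple least-squares regression of one site's cultivar (or cultivar class) means on the cultivar index; the function names and the three data points are hypothetical.

```python
def slope_and_r2(w, y):
    """Slope b and coefficient of determination r^2 for regressing y on w."""
    n = len(w)
    mean_w = sum(w) / n
    mean_y = sum(y) / n
    sww = sum((wi - mean_w) ** 2 for wi in w)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    swy = sum((wi - mean_w) * (yi - mean_y) for wi, yi in zip(w, y))
    b = swy / sww
    r2 = swy ** 2 / (sww * syy)
    return b, r2


def b_is_informative(r2, threshold=0.5):
    """True when r^2 meets the threshold (50% by default)."""
    return r2 >= threshold


# Hypothetical site: yields of three cultivars against the cultivar index.
b, r2 = slope_and_r2([-1.0, 0.0, 1.0], [1.0, 2.0, 4.0])
usable = b_is_informative(r2)
```

Counting the sites for which `b_is_informative` returns true gives tallies of the kind reported above, such as 25 of 34 sites in the across-year analysis.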
Practical Implications
Our study has several important implications for current cultivar testing efforts with field pea and other crops in Alberta and elsewhere. First, under the current system, yield data from cultivar trials are summarized according to geographic regions delineated for each crop. Cultivars with the highest regional averages are recommended to local producers with little regard to the fact that not all sites in a region are capable of the same level of production (Fig. 1). This geography-based approach would have failed to identify the cultivars that are best adapted to good or bad environments because of the masking effect of taking averages over high and low yielding sites and/or years (Helm et al., 2002). There are earlier attempts to amalgamate similar environments through the cluster analysis (e.g., Horner and Frey, 1957; Abou-El-Fittouh et al., 1969; Ghaderi et al., 1980; Brown et al., 1983; Collaku et al., 2002), but they give no objective criterion for determining the number of groups within which sites are similar in yielding ability or other agronomic and production characteristics. The criteria developed by Crossa and Cornelius (1997) and Russell et al. (2003) are based primarily on whether or not crossover interactions are minimized among sites within a group, but with little regard to site performances within the group. While such grouping certainly helps plant breeders to identify cultivars with wide adaptability, it is of limited value to producers whose objective is to find the best possible match-up of cultivars with production levels of their farm fields.
Second, most studies on GEIs have been limited to examining cultivar × site interactions from combined analysis of cultivar trials in a single year. The clustering of sites based on the data from individual years would be practically significant if the clustered groups are repeatable across years (Lin and Butler, 1990; Russell et al., 2003). However, our study (Table 3) and many other studies (e.g., Lin and Binns, 1994) have shown that there is little consistency of site grouping patterns across years, suggesting the limited value of the individual-year analysis. Therefore, we strongly recommend the use of across-year analyses such as ours. In the past, it has been very difficult to conduct the combined analysis of multiyear data because (i) such data are often unbalanced, so that many statistical analyses developed for balanced data are not readily applicable, and (ii) a site effect in the multiyear data would have two confounded components if site × year interaction is ignored: a predictable part due to fixed soil characteristics and photoperiod at a given site and an unpredictable part due to random year-to-year weather fluctuations. Our proposed normalization procedure has allowed for creating cultivar classes and averaging unbalanced data across years, thereby effectively overcoming the above two difficulties. As a result, we were able to reveal the more meaningful grouping of isoyield sites based on the data averaged across years. It should be noted that this ad hoc procedure for the across-year analysis differs somewhat from the commonly used pattern analysis (e.g., DeLacy and Cooper, 1990; Abdalla et al., 1996; Trethowan et al., 2001). In the pattern analysis, proximities between pairs of sites as measured by squared Euclidean distance are calculated for each year and then averaged across years. The site × site matrix of averaged proximities is used for clustering and ordination of sites in the three-way table of year × site × cultivar. For our field pea data, such averaged distances across years were substantially greater than the ones in some years, apparently due to considerable year-to-year variation in the distances between a given pair of sites (results not presented). Consequently, with this elevation in the bottom-line distance between the sites, each individual site became a distinct isoyield group according to the F test.
Third, in Alberta and elsewhere, there is a consistent request for improving the quality and efficiency of the cultivar testing. In any case, it is imperative to provide some basis for identifying a few representative test sites. The number of isoyield groups identified in our study suggests a minimum number of sites that would be needed for the future testing. For our field pea data, such numbers were 6 in 1997; 10 in 1998, 2000, and 2001; 12 in 1999; and 7 for across-year data. However, because the true site effect in individual years was confounded with random year-to-year variation, and because grouping patterns varied from year to year, the number determined from averaged site effects based on the across-year analysis (seven sites) is probably more reflective of true differences among sites. To help determine which site would be selected from each isoyield group, we found it is useful to examine the stability statistics (the b values). Appealing to the interpretation by Finlay and Wilkinson (1963) for cultivar stability, we offer the following considerations when selecting test sites from isoyield groups: (i) a site with the b value close to unity would have average stability, but it would be considered as a good site if it appears in a high-yielding isoyield group and as a bad site if it appears in a low-yielding group; (ii) a site with the b value increasing above unity would have below-average stability, but it would be a good site for high-yielding cultivars and a bad site for average and low yielding cultivars; and (iii) a site with the b value decreasing below unity would have above-average stability, but it would be a good site for low-yielding cultivars and a bad site for high-yielding cultivars.
| CONCLUSIONS |
|---|
| ACKNOWLEDGMENTS |
|---|
Received for publication January 27, 2004.
| REFERENCES |
|---|