Crop Science 42:76-89 (2002)
© 2002 Crop Science Society of America
CROP BREEDING, GENETICS & CYTOLOGY
Repeatability of Model Genetic Coefficients Derived from Soybean Performance Trials across Different States
T. Mavromatis (a),
K. J. Boote (b),
J. W. Jones (*, a),
G. G. Wilkerson (c), and
G. Hoogenboom (d)
a Dep. of Agricultural and Biological Engineering, Univ. of Florida, Gainesville, FL 32611
b Dep. of Agronomy, Univ. of Florida, Gainesville, FL 32611
c Crop Science Dep., North Carolina State Univ., Raleigh, NC 27695
d Dep. of Biological and Agricultural Engineering, Univ. of Georgia, 30223
* Corresponding author (jwj{at}agen.ufl.edu)
ABSTRACT
Crop model testing in diverse environments is essential if modelers wish to make applications or extrapolations to those environments. A recent study demonstrated the effectiveness of optimization techniques for deriving cultivar coefficients for the CROPGRO-Soybean model from typical information provided by soybean performance tests. The objectives of this study were (i) to explore the extent to which cultivar coefficients developed by these approaches from crop performance tests are stable across different regions, (ii) to test the CROPGRO-Soybean model's ability to predict phenology and seed yield using cultivar coefficients that were developed in different regions, and (iii) to investigate whether 3 yr of crop performance data are adequate for developing stable genetic coefficients. A stepwise procedure was applied to derive cultivar coefficients for 10 common cultivars grown in different environments in Georgia and North Carolina. Regarding the transportability of cultivar coefficients across states, we found that the critical daylength coefficients were the most reliable cultivar traits. We found less stability of the cultivar traits that control genetic differences in seed yield potential. The estimated cultivar coefficients developed in Georgia enabled CROPGRO to predict yield and harvest maturity in North Carolina within 3.8% and 3.5 d, respectively, from the observed averages. Using the cultivar coefficients developed from North Carolina environments allowed us to simulate the actual mean yield and harvest maturity in Georgia to within 2.5% and 2.0 d. Furthermore, the model's ability to predict seed yield and maturity with cultivar coefficients developed from 3 yr of data was nearly as good as that derived from much larger data sets.
Abbreviations: DOY, day of year
INTRODUCTION
DYNAMIC, PROCESS-LEVEL crop models are playing an increasing role in analyzing the interactions and contributions of environmental factors (such as temperature, daylength, and water supply), genetics, and physiology on seed yield. CROPGRO-Soybean V 3.5 is a process-oriented model (Boote et al., 1998; Hoogenboom et al., 1994) that simulates carbon, water, and nitrogen balance for the soybean [Glycine max (L.) Merr.] crop and soil under different climate, soil, and management conditions. The model simulates the mass of roots, stems, leaves, seeds, and pod walls, as well as leaf area, root growth, soil water extraction, and plant water deficit on a daily basis during a growing season. CROPGRO has been used for many purposes (Boote et al., 1996). This model incorporates knowledge of cultivar-specific traits to predict daily growth and development as the plant responds to weather, soil characteristics, and management practices (Boote et al., 1998). These cultivar-specific traits are referred to as genetic coefficients (Table 1).
Table 1. Definition of selected cultivar and soil traits for the CROPGRO-Soybean model that were determined by the optimization procedure.
The estimation of genetic and soil parameters can be a tedious, time-consuming process that can be done manually by adjusting some of the parameters so that predicted data fit observations (Boote et al., 1997; Colson et al., 1995) or by fitting different statistical models (linear or exponential) to measured data (Dardanelli et al., 1997). A third approach for model parameter estimation, which is more systematic and objective, is the use of optimization techniques. A predefined objective function is either maximized or minimized during these approaches. Several studies have used optimization procedures to derive soybean cultivar coefficients for predicting flowering date (Grimm et al., 1993) and the occurrence of reproductive stages after flowering (Grimm et al., 1994), for improving crop models (Piper et al., 1998), and for estimating soil and root growth parameters (Calmon et al., 1999a, b) by minimizing the error sum of squares between observations and predictions.
A recent study by Mavromatis et al. (2001) described an approach to estimate soil and cultivar coefficients for CROPGRO-Soybean from typical information (such as anthesis and harvest maturity dates, final seed yield, seed size, canopy height, and lodging) provided by crop performance trials conducted at experimental stations in Georgia. In addition to deriving useful information on site characteristics and cultivar traits, this approach enabled the crop model to satisfactorily mimic the genotypic yield ranking and much of the observed genotype × environment interactions. A study by Irmak et al. (2000) confirmed the value of the same approach by cross validation and demonstrated the ability of CROPGRO-Soybean to predict anthesis, maturity, and yield in Georgia using independent data from the same sites but different years. However, no tests have yet been performed using cultivar-specific coefficients developed from different regions. Furthermore, problems with heat unit sums and incorrect daylength or temperature sensitivities in a model may not show up if the model is run in similar field environments, but they can create major effects if the model is tested in diverse regions that are cooler or warmer (Boote et al., 1996; Piper et al., 1998).
Successful use of crop models in technology transfer and decision support tools requires that coefficients describing new cultivars be available as soon as the cultivars are marketed. Presently, crop performance tests are the only readily available data for many new cultivars when they are released, whether done in-house by private companies or by state/public testing. In most cases, only 2 or 3 yr of crop performance tests are conducted before the new cultivars are released.
Therefore, the objectives of this paper were (i) to explore the extent to which cultivar coefficients from crop performance tests are stable across different regions (Georgia and North Carolina), (ii) to test CROPGRO's ability to predict phenology and seed yield using cultivar coefficients that were developed in different regions, and (iii) to investigate whether 3 yr of crop performance data are adequate for developing stable genetic coefficients.
MATERIALS AND METHODS
Yield Trial Data
Yield trial data from Georgia were obtained from the Field crop performance tests: soybean, peanut (Arachis hypogaea L.), cotton (Gossypium spp.), tobacco (Nicotiana tabacum L.), sorghum [Sorghum bicolor (L.) Moench], and summer annual forages research reports from the Georgia Experiment Stations from 1987 to 1996 (Raymer et al., 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988). We will refer to these crop performance tests as yield trials from now on. The yield trial data include observations of harvest maturity date (day of year, DOY), seed yield (kg ha⁻¹), seed size (mg), and sometimes flowering date (DOY). These trials were conducted by sowing sets of 28 to 115 cultivars over 4 to 10 yr at five Georgia locations [Tifton (31.5° N, 83.5° W), Plains (32° N, 84.3° W), Midville (32.9° N, 82.2° W), Griffin (33.2° N, 84.7° W), and Calhoun (34.3° N, 85.1° W)] ranging in elevation from 150 to 267 m. Only the rainfed plantings were used, since insufficient irrigation information was available for the irrigated trials. A total of 2176 cultivar × site × year combinations divided between early and late plantings were used.
Yield trial data from North Carolina were obtained from the corn (Zea mays L.), corn silage, grain sorghum, and soybean crop performance trials conducted from 1983 to 1993 (Bowman, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983) and the soybean and cotton trials conducted from 1994 to 1998 (Bowman, 1998, 1997, 1996, 1995, 1994). The data set from North Carolina was much larger in terms of cultivars used (202 compared with fewer than 100 in Georgia) and site × year environments (80 compared with 35 in Georgia). The yield trial data included observations of seed yield (kg ha⁻¹), seed size (mg), and sometimes harvest maturity date (DOY). These trials were conducted at eight locations in North Carolina [Bertie (36.1° N, 77.2° W), Columbus (34.4° N, 78.8° W), Edgecombe (35.9° N, 77.7° W), Lenoir (35.4° N, 77.5° W), Rowan (35.7° N, 80.6° W), Stanly (35.7° N, 80.6° W), Washington (35.9° N, 76.6° W), and Wilson (35.7° N, 77.9° W)] ranging in altitude from 6 to 251 m. A total of 7024 cultivar × site × year combinations divided between early and late plantings were used.
For our study, observed yield data were decreased by 13% (based on actual moisture content) to compare with simulated dry weight yields. Ten cultivars common to both data sets were selected: Perrin, Stonewall, Hagood, Cook, Thomas, Colquitt, Brim, Young, Hutcheson, and Deltapine 105 (Table 2). A total of 347 and 532 cultivar × environment combinations were used to estimate cultivar coefficients in Georgia and North Carolina, respectively. The weather during the growing seasons varied with location (Table 3).
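The moisture adjustment described above is a single multiplication. The sketch below is illustrative only: the function name and the example yield are hypothetical, and the 13% figure is taken from the text.

```python
def to_dry_weight_yield(observed_yield_kg_ha, moisture_fraction=0.13):
    """Reduce a reported seed yield by the moisture fraction (dry-weight basis)."""
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture_fraction must be in [0, 1)")
    return observed_yield_kg_ha * (1.0 - moisture_fraction)


# Hypothetical reported yield, for illustration only (not a value from the trials).
reported = 3000.0
print(round(to_dry_weight_yield(reported), 1))  # 2610.0
```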
Table 3. Soil type, mean total precipitation (Prec.), and air temperature (TEMP) for the growing season (May to October) for soybean cultivars at five and eight locations in Georgia and North Carolina, respectively. The number of growing seasons for each location is shown within parentheses.
CROPGRO Model Inputs
Daily weather data of total solar radiation, daily maximum and minimum air temperature, and precipitation are required by CROPGRO. Weather data for each site in Georgia were obtained from the Georgia Automated Environmental Monitoring Network (Hoogenboom, 1996; Hoogenboom and Gresham, 1997). For North Carolina, daily weather data were obtained from the State Climate Office of North Carolina (Box 7236, Raleigh, NC 27695). For trials that were conducted on a North Carolina agricultural research station, data from that station were used. For trials that were conducted off-station, weather data from the closest recording station (within 16 km from the test locations) were used.
CROPGRO inputs also include planting date, population, row spacing, and initial soil water content. The initial soil water content at sowing date was set to field capacity for all site-years. Therefore, model errors may have occurred in some cases where the initial soil water content was too high for some locations in some years. Crops at all locations in Georgia were sown in rows 0.76 m apart at a density of 34 plants m⁻². In North Carolina, a density of 30 plants m⁻² was set, but the row spacing varied between 0.19 and 0.97 m. The effects of tillage, pests, and diseases were not directly considered in our simulations.
CROPGRO uses a number of cultivar-specific parameters (genetic coefficients) to predict soybean daily growth and development in response to weather, soil characteristics, and management (Boote et al., 1998). The genetic coefficients (Table 1) describe (i) the cultivar sensitivity to daylength (CSDL), (ii) durations of life cycle phases (seed to physiological maturity, SDPM), (iii) vegetative growth traits (light-saturated leaf photosynthesis rate, LFMAX), and (iv) reproductive growth traits (potential seed size). The life cycle phase coefficients (emergence to flowering, flowering to beginning seed, and beginning seed to physiological maturity) relate to life cycle timing and are measured in photothermal days. The latter is a unit that combines the standard concept of degree-days with a measure of daylength. Most cultivar coefficients are generally similar for cultivars within a maturity group (Boote et al., 1997). This provides a starting point, as approximate values are known for all maturity groups (Grimm et al., 1993, 1994; Boote et al., 1997). The generic values of cultivar traits for maturity groups V to VIII were used in this study (Table 4). However, individual cultivars frequently vary from maturity group norms. This, along with site-specific and year-specific environmental variation, results in variation in cultivar performance over locations and years.
Table 4. Generic coefficient values of CSDL, FLSH, FLSD, SDPM, R1PPO, SFDUR, PODUR, LFMAX, and THRESH for soybean maturity groups V to VIII.
For each site-year, the most common soil types were identified from soil surveys (Table 3). Soil types and families at the locations in Georgia were described by Mavromatis et al. (2001). In North Carolina, soil series and soil surface texture information for each field used in the crop performance trials was included in the summary reports (Bowman, 1998, 1997, 1996, 1995, 1994). At Bertie, for example, trials were performed in fields with soils belonging to three different soil series from 1992 to 1998. The soil characteristics for each soil series and surface texture were used to calculate the soil physical and chemical parameters required to run the CROPGRO model (Ritchie et al., 1989; Tsuji et al., 1994).
Estimating Cultivar Coefficients and Soil Parameters
The genetic coefficients and soil parameters required by the model were estimated with a stepwise procedure similar to that employed by Mavromatis et al. (2001) as follows: (i) candidate coefficients/parameters were selected; (ii) the values of the coefficients/parameters were changed by running CROPGRO in an optimization shell until the error sum of squares (simulated minus observed) was minimized; and (iii) the set of coefficients/parameters that produced the lowest root mean square error (RMSE) was adopted. The success of this procedure was shown by a reduction in RMSE from one step to the next. The optimizations were done by two-dimensional linear grid searches.
Since we wanted the estimation of soil parameters for each site-year for each state to be as independent as possible from the estimation of cultivar coefficients, we optimized the soil fertility factor (SLPF) and the soil water holding limits (DUL-LL) (Table 1) that best predicted mean measured yield for each site-year using all cultivars except the selected ones (n = 10) (Table 2). In CROPGRO, SLPF is an input variable (constant for a given field site) that affects biomass growth rate by modifying daily canopy photosynthesis. SLPF is attributed to soil fertility differences or soil-based pests, such as nematodes. A two-dimensional linear grid search was used to find the combination of SLPF and (DUL-LL) that minimized the sum of squares of the errors between simulated and observed seed yield. For one direction of the search, a pseudo variable shifted DUL and SAT in each soil layer together for the specific soils over the depth of the soil profile. For the other search dimension, SLPF was allowed to vary within a range supported by literature (Jones et al., 1989). The optimized soil traits were used with the 10 selected cultivars for the rest of the study. The exact values for SLPF and (DUL-LL) were not of direct relevance here; rather, accurate simulation of mean site yield was the goal.
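The two-dimensional grid search can be sketched as follows. This is a minimal illustration, not the optimization shell used in the study: `toy_model` is a hypothetical stand-in for a CROPGRO run, and the grid ranges and resolutions are assumptions chosen only to make the example run.

```python
from itertools import product


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def grid_search_2d(simulate, observed, grid_a, grid_b):
    """Return (sse, a, b) for the grid point with the smallest error sum of squares."""
    best = None
    for a, b in product(grid_a, grid_b):
        predicted = simulate(a, b)
        sse = sum((p - o) ** 2 for p, o in zip(predicted, observed))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best


def toy_model(slpf, water_shift):
    """Hypothetical yield response; NOT CROPGRO, only a placeholder."""
    base_yields = [2500.0, 3000.0, 3500.0]  # made-up site-year yields, kg/ha
    return [y * slpf + 400.0 * water_shift for y in base_yields]


# Synthetic "observations" generated from known settings so the search can be checked.
observed = toy_model(0.85, 0.5)
sse, slpf, shift = grid_search_2d(
    toy_model, observed, linspace(0.7, 1.0, 7), linspace(-1.0, 1.0, 9)
)
print(round(slpf, 2), shift)  # 0.85 0.5
```

An exhaustive grid is simple and cannot be trapped in a local minimum inside the searched range, at the cost of one model run per grid point.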
In step two, for each cultivar over all locations, years, and planting dates, the observed maturity date was fit by means of a two-way linear grid search to minimize the error sum of squares between simulated and observed harvest maturity. For one direction of the search, a pseudo variable x (ranging from −1 to +1) shifted (FLSD + SDPM) together within a range supported by previous work (Piper et al., 1996). In the second direction of the search, we varied CSDL with R1PPO linked to CSDL (Table 1). R1PPO acts after anthesis and decreases CSDL by 0 to 1 h, making plant development more sensitive to photoperiod after anthesis. Piper et al. (1996) provided evidence that allowing shorter CSDL after R1 often resulted in a better fit of observed phenology for cultivars of maturity groups later than MG III. CSDL was allowed to change within the range of the same maturity group to which the cultivars were assigned (Table 2). Once (FLSD + SDPM) had been fit, FLSH was set proportionally to FLSD as in Mavromatis et al. (2001).
In the third step, for each cultivar over all locations, years, and planting dates, a two-way linear grid search was used to minimize the error sum of squares between observed and simulated seed yield, to account for any remaining differences in yield among cultivars across sites that would be attributed to cultivar traits other than effects via maturity date. The same approach as in Mavromatis et al. (2001) was adopted. In one search direction, we created a productivity pseudo variable X1 that shifted LFMAX and THRESH (Table 1) together. For the second search dimension, a second pseudo variable X2 (ranging from −1 to +1) was developed that jointly shifted cultivar internal life cycle coefficients (FLSH, FLSD, SDPM, SFDUR, and PODUR) that affected seed yield by starting pod and seed growth sooner or later within the previously fixed time from anthesis to maturity (Table 1). The latter traits act to change harvest index but not total biomass. The shifts were all made together in a fashion that caused minimal changes in maturity date, since we did not want to disturb the cultivar life cycle length that we had already estimated.
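A pseudo variable that moves several coefficients together can be illustrated with a simple mapping. The sketch below assumes a linear mapping from a pseudo variable scaled to [-1, +1] onto each coefficient's allowed range; the text does not specify the mapping actually used. The THRESH limits are those quoted in this paper, while the LFMAX bounds are borrowed from the reported genetic range purely for illustration.

```python
def shift_jointly(x, bounds):
    """Map a pseudo variable x in [-1, +1] onto several coefficients at once.

    x = -1 returns each lower bound, x = +1 each upper bound, x = 0 the midpoint.
    The linear form is an assumption made for this sketch.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, +1]")
    fraction = (x + 1.0) / 2.0
    return {name: lo + fraction * (hi - lo) for name, (lo, hi) in bounds.items()}


# Illustrative bounds for the productivity pseudo variable X1 (assumed, see lead-in).
x1_bounds = {"LFMAX": (0.82, 1.39), "THRESH": (75.5, 80.5)}

for x1 in (-1.0, 0.0, 1.0):
    print(x1, shift_jointly(x1, x1_bounds))
```

Tying coefficients to one pseudo variable reduces a many-parameter problem to a single search dimension, which is what keeps each step a two-dimensional grid.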
Testing Stability of Cultivar Coefficients
First, we compared the two sets of cultivar coefficients (one for each state) with each other to investigate the extent to which they were repeatable (or stable). The cultivar coefficients for each of the 10 selected cultivars were then used with CROPGRO to predict the actual maturity and yield for the state where they were developed. Then we applied the cultivar coefficients estimated in Georgia to predict the observed yields and maturity dates in North Carolina and vice versa.
The calibration procedure of estimating cultivar coefficients for the selected cultivars was repeated with only 3 yr of crop performance data (from 1992 to 1994) from the Georgia data set. This period was selected because it was the only common one for all cultivars. The 3-yr coefficients were compared with those developed from the whole Georgia data set. We then evaluated the coefficients from calibration by applying them with the crop model to predict the actual yield and harvest maturity for the remaining part of the Georgia data set that was not used for the development of cultivar coefficients (1987 to 1991 and 1995 to 1996). We were unable to apply the same procedure for the North Carolina data set because of a limited number of maturity observations.
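The split between calibration and evaluation years can be written out explicitly. The helper below is a hypothetical illustration; it assumes the Georgia record spans 1987 to 1996, as stated in the description of the yield trial data.

```python
def split_years(all_years, calibration_years):
    """Partition years into a calibration set and an independent evaluation set."""
    calibration = sorted(y for y in all_years if y in calibration_years)
    evaluation = sorted(y for y in all_years if y not in calibration_years)
    return calibration, evaluation


georgia_years = range(1987, 1997)  # 1987 through 1996 inclusive
calibration, evaluation = split_years(georgia_years, {1992, 1993, 1994})
print(calibration)  # [1992, 1993, 1994]
print(evaluation)   # [1987, 1988, 1989, 1990, 1991, 1995, 1996]
```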
RESULTS AND DISCUSSION
Comparison of Cultivar Coefficients across States
One of the underlying hypotheses of crop simulation models is that genetic coefficients are constant or stable across multiple environments. The regression analysis between the estimated CSDL values for the 10 cultivars showed very close agreement between the values solved in North Carolina and those solved independently in Georgia (r² = 0.96), also evidenced by an intercept near zero and a slope very close to unity (Fig. 1a). This suggests that the critical daylength sensitivity can be estimated in one state where data are available and is equally valid in other states. Optimized CSDL values for MG VII cultivars in both states were within the typical MG range (Table 4) and were generally smaller than the values for MG V and VI (Table 5). The search algorithm reached the minimum allowed CSDL for Thomas in both states, which suggests a shift of this cultivar toward a later maturity group. However, a single maturity group shift should not be of concern, since private companies often assign soybean cultivars to maturity groups for marketing reasons. R1PPO was coupled with CSDL and thus increased as CSDL and MG increased (Table 5).
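The slope, intercept, and r² used to judge repeatability come from an ordinary least squares fit of one state's coefficients on the other's. The sketch below shows that calculation in plain Python; the CSDL values in the example are hypothetical placeholders, not the estimates in Table 5.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical critical daylengths (h) for four cultivars in two states.
csdl_state_a = [12.2, 12.4, 12.6, 12.8]
csdl_state_b = [12.25, 12.35, 12.65, 12.8]
slope, intercept, r2 = linear_fit(csdl_state_a, csdl_state_b)
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r2:.2f}")
```

A slope near 1 with an intercept near 0 indicates that the two sets agree in absolute value, which a high r² alone would not guarantee.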

Fig. 1. Comparison of critical daylengths (CSDL) (a), the time from first flower to physiological maturity (FLPM) (b), and leaf photosynthesis rates (LFMAX) (c), estimated for 10 soybean cultivars in Georgia and North Carolina. The 1:1 line (dot) and the regression line (solid) between the data are also shown.
Table 5. Estimates for each soybean cultivar of CSDL, FLSH, FLSD, SDPM, R1PPO, SFDUR, PODUR, LFMAX, and THRESH derived separately with data from Georgia and North Carolina.
The time intervals FLSD and SDPM, on the other hand, were somewhat smaller than typical generic MG values (Tables 4 and 5). This was more evident for Georgia than North Carolina. The sum of these two periods (FLPM = FLSD + SDPM) for each cultivar is plotted as values solved in North Carolina vs. those solved in Georgia (Fig. 1b). Although the r² of the regression analysis was not as high as for CSDL (0.66 versus 0.96), there was still good repeatability between values in the two states. The values for time between first flower and physiological maturity (FLPM) for North Carolina were consistently higher than for Georgia, a situation we suspect is related to the lower temperatures in North Carolina (Table 3) and possibly incorrect model temperature sensitivity during seed fill.
Variation in LFMAX (part of the X1 variable) (Table 5) was generally within the expected genetic range reported by Boote and Tollenaar (1994) (from 0.82 to 1.39 mg CO₂ m⁻² s⁻¹). The regression analysis between the estimated LFMAX values for the 10 cultivars showed weak agreement, as evidenced by the low r² = 0.19 between the values solved in North Carolina and those solved in Georgia (Fig. 1c). LFMAX for Colquitt and Hutcheson was much higher in Georgia than in North Carolina. The opposite was found for Thomas. THRESH did not deviate much from the maturity group norms (Table 4) and it never attained its upper or lower limits (80.5 and 75.5). Since LFMAX and THRESH were shifted together (X1 variable), both moved toward relatively high or low values at the same time. We have more confidence in the cultivar shifts in LFMAX and we do not imply any true linkage between those two coefficients. The r² was even lower (r² = 0.04) between the values of seed filling durations (SFDUR) (part of the X2 suite) for the 10 cultivars developed in the two states. These results suggest that less confidence should be placed on the repeatability of the solved cultivar traits that affect seed yield (X1 and X2) compared with the cultivar traits that control the individual phases of the life cycle (FLSD and SDPM) and the CSDL.
To attain high yield, the high yielding cultivars should be characterized by high values for X1 (LFMAX and THRESH) or high values for X2 (FLSH, FLSD, SDPM, SFDUR, and PODUR) or moderately high values for both. It is important to add that fitting the cultivar coefficients to observed yield was designed to explain only that part of the observed yield variation that was not accounted for by the site characteristics or life cycle traits. The observed yields for the 10 cultivars in Georgia and North Carolina were achieved in different ways for the modeled cultivars following our approach in terms of LFMAX (part of the X1 variable) and SFDUR (part of the X2 variable) (Fig. 2). In Georgia, the three highest yielding cultivars (Brim, Cook, and Hutcheson) had relatively high LFMAX (higher than the generic values) and moderate to high SFDUR. The lowest yielding cultivars (Perrin, DP 105, and Thomas) had lower than typical LFMAX values. Young was an exception, since it achieved high yield with the longest allowed seed filling duration and low LFMAX. In that case, the response of the simulated yield to long seed filling duration offset the effects of the low photosynthetic rates. In North Carolina, two of the high yielding cultivars (Brim and Cook) had higher than typical LFMAX values while moderately high yielding cultivars (DP 105, Young, and Hutcheson) had only moderate LFMAX values (Fig. 2).

Fig. 2. Leaf photosynthesis rate (LFMAX) and seed filling duration (SFDUR) estimated for 10 soybean cultivars in Georgia (a) and North Carolina (b). The line shows the generic value for LFMAX. The bubble-size for each cultivar is based on the observed mean yield. The initials for each cultivar are also shown inside the bubbles (P for Perrin, C for Cook, S for Stonewall, Ha for Hagood, T for Thomas, col for Colquitt, B for Brim, Y for Young, Hu for Hutcheson, D for DP 105).
A regression analysis was conducted to compare the proportion of the observed mean cultivar yield variability explained by LFMAX (Fig. 3) and seed filling duration (SFDUR) (data not shown) of the 10 cultivars in both states. For Georgia, LFMAX accounted for a much larger proportion of the cultivar yield variability than SFDUR, as evidenced by the r² (72 vs. 7%) (Fig. 3). For the North Carolina data, LFMAX (X1) also explained a larger percentage of the cultivar yield variance than did SFDUR (61 vs. 7%) (Fig. 3). Seed filling duration, on the other hand, was not a good criterion to discriminate between high and low yielding cultivars, since it accounted for only 7% of the observed cultivar yield variability in either state.

Fig. 3. Leaf photosynthesis rate (LFMAX) versus observed mean cultivar yield for 10 soybean cultivars in Georgia and North Carolina. Regression lines between the data are also shown.
There are several reasons why yield differences between cultivars were not always attributed to the same increases or decreases in the solved genetic coefficients in both states. Yield is a complex integral of all the crop processes occurring from sowing to harvest, including not just physiological gain of assimilates but also pest tolerances. First, the typical information recorded in crop performance trials was too limited to characterize cultivars for yield-influencing traits beyond yield itself and maturity date. Knowledge about harvest index, dry matter accumulation potential (leaf photosynthesis), and parent-progeny relationships among cultivars could help guide an improved parameter search. Second, the two data sets from Georgia and North Carolina were not 100% orthogonal for all sites and years, which made the comparison of cultivar coefficients more difficult (Table 3). Third, the CSDL can cause faster or slower actual progress despite a long or short SDPM and SFDUR; thus an absolute relationship to SFDUR should not be expected. Last but not least, our procedure ultimately adopts the set of coefficients/parameters that produced the lowest root mean square error (RMSE) between observed and simulated data. There is an additional small cloud of RMSE points around the lowest RMSE, each of which represents a different combination of coefficients/parameters. With only slightly larger RMSE, other combinations of coefficients could have been accepted if we had more knowledge, based on parent-progeny relationships or repeatability from state to state, to place minor restrictions on the traits.
Evaluating Predictions with Coefficients Developed within Regions
When CROPGRO was run in Georgia with the coefficients developed in Georgia, the simulated maturity at harvest for most of the cultivars was within 1 d, on average, of the observed dates (Table 6). The index of agreement (d) (Willmott, 1982), RMSE, and r² between observed and simulated maturity dates were higher for group VII cultivars than for V and VI. The mean yield difference for Georgia varied on average from -0.8% for DP 105 to +2% for Thomas. When CROPGRO was run in North Carolina with the coefficients developed on-site, the model was able to reproduce the observed harvest maturity equally well, although the sample size for some cultivars was very small (Table 6). The index of agreement and r² between observed and simulated yield were somewhat lower than the respective values found for the same cultivars in Georgia. In addition, the model correctly reproduced the observed yield-based genotype ranking among cultivars for both states (RK2 and RK4 in Table 7), although the ranking placement was different for the same cultivars in the two states.
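For reference, the two fit statistics used throughout this section can be computed as below. The index of agreement follows the commonly cited form d = 1 − Σ(Pᵢ − Oᵢ)² / Σ(|Pᵢ − Ō| + |Oᵢ − Ō|)², where P is predicted, O is observed, and Ō is the observed mean. The maturity dates in the example are hypothetical, not values from Table 6.

```python
from math import sqrt


def rmse(observed, simulated):
    """Root mean square error between paired observations and simulations."""
    n = len(observed)
    return sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)


def index_of_agreement(observed, simulated):
    """Index of agreement d: 1 is perfect agreement, lower values are worse."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    denominator = sum(
        (abs(s - mean_obs) + abs(o - mean_obs)) ** 2
        for o, s in zip(observed, simulated)
    )
    return 1.0 - numerator / denominator


# Hypothetical harvest maturity dates (DOY), for illustration only.
obs_doy = [280, 290, 300]
sim_doy = [282, 288, 303]
print(round(rmse(obs_doy, sim_doy), 2))                # 2.38
print(round(index_of_agreement(obs_doy, sim_doy), 3))  # 0.98
```

Unlike r², d penalizes systematic bias as well as scatter, which is why both are reported alongside RMSE.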
Table 6. Simulated (Sim.) harvest maturity and yield for each soybean cultivar and for all cultivars (Mean) in Georgia and North Carolina with the coefficients developed in Georgia and North Carolina, respectively.
Table 7. Observed yield (Obs.), observed yield ranking (RK1) and simulated rankings (RK2–RK5) for ten soybean cultivars in Georgia and North Carolina.
The regression analysis between observed and simulated seed yield for Georgia (Fig. 4a) showed a tendency of the crop model to overestimate the lowest yield observations and underestimate the highest. This tendency can be partly explained by the fact that CROPGRO also overestimated and underestimated the shortest and longest life cycles, respectively (Fig. 4b). We might have been able to avoid these tendencies if we had used a weighted least squares search instead of an ordinary least squares search. We reached the same conclusions when we plotted measured and predicted yield and life cycles for North Carolina (Table 6). The fit to maturity, despite the small sample sizes, was somewhat better for North Carolina than for Georgia, as evidenced by the lower RMSE and the higher d and r² (Table 6). The yield fits, on the other hand, were significantly better for Georgia.

Fig. 4. Comparison of simulated versus observed seed yield (a) and harvest maturity (b) for 10 soybean cultivars in Georgia with the coefficients developed in Georgia. The 1:1 line (dot) and the regression line (solid) between the data are also shown.
Evaluating Predictions with Coefficients Developed in Different Regions
The real test is how the model will predict yield and maturity in one region with cultivar traits developed in a different region. The regression of the predicted yield and maturity against the observations from Georgia in which the coefficients developed in North Carolina were used, and vice versa, demonstrated the model's ability to predict maturity and yield nearly as well as when the local cultivar coefficients were used, as evidenced by the RMSE and d (Tables 6 and 8) and the intercepts, slopes, and r² (Fig. 5 and 6). CROPGRO was able to explain virtually the same percentage of the observed variability in yield (75% for Georgia and 63% for North Carolina) (Fig. 5a and 6a) and maturity (87% for Georgia and 89% for North Carolina) (Fig. 5b and 6b) using cultivar coefficients from the other state rather than the local traits (Table 6). The percentage of variation accounted for by the model decreased by only about 1% when out-of-region cultivar coefficients were used.
Table 8. Simulated (Sim.) harvest maturity and yield for each soybean cultivar and for all cultivars (Mean) in Georgia and North Carolina with the coefficients developed in North Carolina and Georgia, respectively.

Fig. 5. Comparison of simulated versus observed seed yield (a) and harvest maturity (b) for 10 soybean cultivars in Georgia with the coefficients developed in North Carolina. The 1:1 line (dot) and the regression line (solid) between the data are also shown.
Fig. 6. Comparison of simulated versus observed seed yield (a) and harvest maturity (b) for 10 soybean cultivars in North Carolina with the coefficients developed in Georgia. The 1:1 line (dot) and the regression line (solid) between the data are also shown.
The same conclusion was reached by comparing the slopes and intercepts of the regression lines (Fig. 5 and 6). We estimated that solving for cultivar traits decreased the RMSE between observed and simulated yields by about 1 to 14% in Georgia and by 3 to 17% in North Carolina, depending on the cultivar, compared with the RMSE estimated after the estimation of site characteristics. It is important to emphasize that site traits (SLPF, DUL-LL, weather) accounted for much more yield variation than did cultivar traits (66 and 73% of the observed yield variation in North Carolina and Georgia, respectively, compared with 3 and 4%, respectively, for cultivar variation).
By using the cultivar coefficients that were developed in opposite states, rather than locally derived cultivar traits, the crop model overpredicted the actual mean yield in Georgia by only 2.5% and underpredicted by 3.8% in North Carolina (Table 8), and the RMSE of yield predictions increased by 7.3% in Georgia (from 358 to 384 kg ha-1) and by 10.8% in North Carolina (from 443 to 491 kg ha-1). The RMSE of harvest maturity also slightly increased by 0.5 d in Georgia (from 5.5 to 6 d) and by 1.2 d in North Carolina (from 5.2 to 6.4 d). The index of agreement for yield and maturity was virtually the same using the cultivar traits developed across states (Tables 6 and 8). In North Carolina, CROPGRO consistently predicted earlier maturities and underestimated the mean actual yield for eight of 10 cultivars. In Georgia, the use of cultivar traits from North Carolina had the opposite effects on harvest maturity dates and final yield.
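The relative RMSE increases quoted above can be checked with a short calculation. This sketch assumes the yield RMSE pairs read as 358 to 384 kg ha-1 for Georgia and 443 to 491 kg ha-1 for North Carolina; the helper name `percent_change` is hypothetical.

```python
def percent_change(baseline, value):
    """Relative change from baseline to value, in percent."""
    return 100.0 * (value - baseline) / baseline


# Yield RMSE (kg ha-1): local coefficients versus out-of-state coefficients
georgia = percent_change(358.0, 384.0)
north_carolina = percent_change(443.0, 491.0)

print(f"Georgia: {georgia:.1f}%")                # Georgia: 7.3%
print(f"North Carolina: {north_carolina:.1f}%")  # North Carolina: 10.8%
```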
The genotype rankings predicted by the crop model agreed substantially with the observed rankings, although the model did not always reproduce the exact ranking of the 10 cultivars (the r2 of rankings was 0.67 in Georgia and 0.65 in North Carolina) (Table 7). This may be of less concern given the very small yield differences among cultivars (Table 2). Furthermore, CROPGRO was still able to separate the high from the low yielding cultivars (as evidenced by the small differences in ranking).
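One way to quantify agreement between observed and simulated cultivar rankings is sketched below. The text does not state how the r2 of rankings was calculated, so this assumes it is the squared Pearson correlation between rank positions; the function names and the yields in the usage note are hypothetical.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_r2(observed, simulated):
    """Squared correlation between the rank positions of two yield series."""
    ro, rs = ranks(observed), ranks(simulated)
    n = len(ro)
    mo, ms = sum(ro) / n, sum(rs) / n
    sxy = sum((a - mo) * (b - ms) for a, b in zip(ro, rs))
    sxx = sum((a - mo) ** 2 for a in ro)
    syy = sum((b - ms) ** 2 for b in rs)
    return sxy ** 2 / (sxx * syy)
```

For example, with four hypothetical cultivars whose observed mean yields are 3000, 2900, 2800, and 2700 kg ha-1 and whose simulated yields are 2950, 2990, 2750, and 2700 kg ha-1, only the top two cultivars swap places and `rank_r2` returns 0.64. Small yield differences can therefore lower the statistic noticeably even when high and low yielding cultivars are still separated.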
Stability of Cultivar Coefficients Using 3 yr of Data
The regression analysis between the CSDL coefficients estimated for Georgia on the basis of only 3 yr of data from 1992 to 1994 (Table 9) and those estimated on the basis of the whole data set from 1987 to 1996 (Table 5) showed a very close agreement (r2 = 0.97) with a low intercept and a slope very close to unity. On the other hand, FLPM (part of the X2 variable) was somewhat higher for the 3-yr period since the actual life cycle during 1992 to 1994 was 3.6 d longer on average. Although the r2 = 0.75 of the regression analysis for FLPM was not as high as for CSDL, the regression line was virtually parallel to the 1:1 line. The estimated LFMAX values (part of the X1 variable) were similar for six out of 10 cultivars. The estimated LFMAX values from the 3-yr period were significantly higher for DP105 and Hagood (Tables 5 and 9). The life cycle for DP105 during 1992 to 1994 was longer by a week compared with the baseline period and the similar seed yield resulted in higher X1 that counteracted the effects of the shorter seed filling period (lower X2). In the case of Hagood, however, the search algorithm reached a somewhat lower X2 and significantly higher X1 despite the similar actual maturity and yield compared with the baseline period. When we reran Hagood during 1992 to 1994 using the coefficients we had developed from the baseline period, the simulated maturity and yield were equally close to the actual means but the RMSE were slightly higher. Therefore, in the case of Hagood the strict objective function we used (minimization of RMSE) resulted in much different cultivar traits than a more relaxed function would have produced.
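The contrast between a strict and a more relaxed objective function can be illustrated with a small sketch. This is not the search procedure used in the study: it assumes, purely for illustration, that a relaxed rule accepts any candidate coefficient set whose RMSE is within a chosen tolerance of the minimum and then prefers the set closest to a reference (for example, the baseline-period coefficients). All names and numbers are hypothetical.

```python
def strict_choice(candidates):
    """Strict objective: pick the candidate with the lowest RMSE."""
    return min(candidates, key=lambda c: c["rmse"])


def relaxed_choice(candidates, reference, tolerance=0.05):
    """Relaxed objective (illustrative assumption, not the authors' method).

    Accept every candidate whose RMSE is within `tolerance` (a fraction) of
    the minimum RMSE, then pick the one closest to the reference coefficients.
    """
    best_rmse = min(c["rmse"] for c in candidates)
    acceptable = [c for c in candidates if c["rmse"] <= best_rmse * (1 + tolerance)]

    def distance(candidate):
        # Sum of squared relative deviations from the reference coefficients
        return sum(
            ((candidate["coef"][name] - ref) / ref) ** 2
            for name, ref in reference.items()
        )

    return min(acceptable, key=distance)


# Hypothetical candidate sets for the two yield traits X1 and X2
candidates = [
    {"label": "A", "coef": {"X1": 1.30, "X2": 0.90}, "rmse": 400.0},
    {"label": "B", "coef": {"X1": 1.05, "X2": 1.00}, "rmse": 410.0},
    {"label": "C", "coef": {"X1": 1.00, "X2": 1.00}, "rmse": 500.0},
]
reference = {"X1": 1.00, "X2": 1.00}

strict = strict_choice(candidates)
relaxed = relaxed_choice(candidates, reference)
```

In this toy case the strict rule selects set A, while the relaxed rule selects set B, accepting a slightly higher RMSE in exchange for coefficients much closer to the reference. That mirrors the Hagood case, where the baseline coefficients gave almost the same fit as the 3-yr estimates.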
Table 9. Estimates for each soybean cultivar of CSDL, FLSH, FLSD, SDPM, R1PPO, SFDUR, PODUR, LFMAX, and THRESH using crop performance data from Georgia 1992 to 1994.
The model's ability to predict maturity for the validation period with the coefficients developed from the 1992 to 1994 data was good as evidenced by the r2, RMSE, and d (Tables 6 and 10) compared with results based on the coefficients developed from the whole data set. CROPGRO was also effective in reproducing the mean yield despite an increase in RMSE from 358 to 423 kg ha-1 and the slightly lower r2 and d values.
Table 10. Comparison between simulated (Sim.) and observed (Obs.) harvest maturity and yield for each soybean cultivar and for all cultivars (Mean) for the validation period (1987 to 1991 and 1995 to 1996) in Georgia.
CONCLUSIONS
Our results confirmed the conclusions of the recent studies (Mavromatis et al., 2001; Irmak et al., 2000) and demonstrated that a large number of yield trials representing different environments can be successfully used to derive cultivar information for use in crop models. With typical information provided by crop performance tests and a systematic approach, it is possible to estimate and provide coefficients describing new cultivars before they are made available to growers for commercial production.
The optimization procedure we applied for site and cultivar (maturity and yield-influencing) traits produced reliable cultivar coefficients and site characteristics that when applied in-state enabled the CROPGRO model to reproduce successfully the observed yield-based cultivar ranking across environments even when the site characteristics were estimated independently. The use of the estimated cultivar coefficients in different environments from where they were developed also resulted in accurate predictions of observed maturity and yield (as evidenced by the close agreement in terms of mean, d, and RMSE). Furthermore, the model's ability to predict seed yield and harvest maturity with cultivar coefficients estimated from only 3 yr of data was nearly as good as that resulting from a much larger number of environments.
Regarding the transportability of cultivar coefficient estimates in different environments, we identified the CSDL as the most reliably estimated cultivar trait. We have somewhat less confidence in the stability across environments of the other estimated cultivar traits that control the individual phases of the life cycle (FLSD and SDPM). The longer SDPM in North Carolina compared with Georgia with similar CSDL argues that the model's temperature function for the phase SDPM may be incorrect. Slightly slower development during SDPM under the cooler temperatures in North Carolina is needed to allow a shorter SDPM. Less confidence should be placed on the stability of estimated cultivar traits that affect seed yield (X1 and X2), as contrasted to those affecting maturity. Generally, higher yielding cultivars solve out for higher LFMAX in both regions or sometimes moderate LFMAX with longer SFDUR. However, we identified cases where our technique reached combinations of cultivar coefficients very different from one region to another. In those cases, additional information that is not currently recorded in crop performance trials, such as harvest index, dry matter accumulation, and parent-progeny relationships, could help guide an improved trait search that would lead to more realistic sets of cultivar coefficients. A part of the problem may be due to the fact that we started with nine cultivar coefficients and linked the yield coefficients into two yield traits plus two phenology traits. Soybean cultivars are more complex than this, but present yield trials constrain information to measured yield and maturity only. We are currently working on a more relaxed objective function that will allow us to identify more realistic cultivar coefficients even at the expense of slightly higher RMSE. Despite reaching different combinations of those traits, the yield rankings of cultivars in a different region were generally acceptably close.
NOTES
Florida Agricultural Experiment Station, Journal Series No. R-07981.
Received for publication February 22, 2001.
REFERENCES
- Boote, K.J., J.W. Jones, and G. Hoogenboom. 1998. Simulation of crop growth: CROPGRO Model. p. 651-692. In R.M. Peart and R.B. Curry (ed.) Agricultural systems modeling and simulation. Marcel Dekker, New York.
- Boote, K.J., J.W. Jones, G. Hoogenboom, and G.G. Wilkerson. 1997. Evaluation of the CROPGRO-soybean model over a wide range of experiments. p. 113-133. In Kropff et al. (ed.) Systems approaches for sustainable agricultural development: Applications of systems approaches at the field level. Kluwer Academic Publishers, Boston.
- Boote, K.J., J.W. Jones, and N.B. Pickering. 1996. Potential uses and limitations of crop models. Agron. J. 88:704-716.
- Boote, K.J., and M. Tollenaar. 1994. Modeling genetic yield potential. p. 533-565. In K.J. Boote et al. (ed.) Physiology and determination of crop yield. ASA, CSSA, and SSSA, Madison, WI.
- Bowman, D.T. 1983. Measured crop performance: Part IV Soybeans. Research Report No. 94, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1984. Measured crop performance: Part IV Soybeans. Research Report No. 98, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1985. Measured crop performance: Part IV Soybeans. Research Report No. 102, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1986. Measured crop performance: Part IV Soybeans. Research Report No. 106, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1987. Measured crop performance: Part IV Soybeans. Research Report No. 110, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1988. Measured crop performance: Part IV Soybeans. Research Report No. 115, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1989. Measured crop performance: Part IV Soybeans. Research Report No. 120, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1990. Measured crop performance: Part IV Soybeans. Research Report No. 128, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1991. Measured crop performance: Part IV Soybeans. Research Report No. 134, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1992. Measured crop performance: Part IV Soybeans. Research Report No. 139, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1993. Measured crop performance: Part I Soybeans. Research Report No. 144, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1994. Measured crop performance: Part I Soybeans. Research Report No. 150, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1995. North Carolina measured crop performance soybean and cotton 1995. Research Report No. 156, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1996. North Carolina measured crop performance soybean and cotton 1996. Research Report No. 163, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1997. North Carolina measured crop performance soybean and cotton 1997. Research Report No. 172, Dep. of Crop Sci., NCSU, Raleigh, NC.
- Bowman, D.T. 1998. North Carolina measured crop performance soybean and cotton 1998. Res. Rept. 176, Crop Science Dep., NCSU, Raleigh, NC.
- Calmon, M.A., W.D. Batchelor, J.W. Jones, J.T. Ritchie, K.J. Boote, and L.C. Hammond. 1999a. Simulating soybean root growth and soil water extraction using a functional crop model. Trans. ASAE 42:1867-1877.
- Calmon, M.A., J.W. Jones, D. Shinde, and J.E. Specht. 1999b. Estimating parameters for soil water balance models using adaptive simulated annealing. Trans. ASAE 15:703-713.
- Colson, J., A. Bouniols, and J.W. Jones. 1995. Soybean reproductive development: Adapting a model for European cultivars. Agron. J. 87:1129-1139.
- Dardanelli, J.L., O.A. Bachmeier, R. Sereno, and R. Gil. 1997. Rooting depth and soil water extraction patterns of different crops in a silty Haplustoll. Field Crops Res. 54:29-38.
- Grimm, S.S., J.W. Jones, K.J. Boote, and D.C. Herzog. 1994. Modeling the occurrence of reproductive stages after flowering for four soybean cultivars. Agron. J. 86:31-38.
- Grimm, S.S., J.W. Jones, K.J. Boote, and J.D. Hesketh. 1993. Parameter estimation for predicting flowering date of soybean cultivars. Crop Sci. 33:137-144.
- Hoogenboom, G. 1996. The Georgia Automated Environmental Monitoring Network. p. 343-346. In Preprints 22nd Conference on Agricultural and Forest Meteorology with Symposium on Fire and Forest Meteorology. American Meteorological Society, Boston, MA.
- Hoogenboom, G., and D.D. Gresham. 1997. Automated weather station network. p. 483-486. In K.J. Hatcher (ed.) Proceedings of the 1997 Georgia Water Resources Conference. Institute of Ecology, The University of Georgia, Athens, GA.
- Hoogenboom, G., J.W. Jones, P.W. Wilkens, W.D. Batchelor, W.T. Bowen, L.A. Hunt, N.B. Pickering, U. Singh, D.C. Godwin, B. Baer, K.J. Boote, J.T. Ritchie, and J.W. White. 1994. Crop models. p. 95-244. In G.Y. Tsuji et al. (ed.) DSSAT v3. University of Hawaii, Honolulu, HI.
- Irmak, A., J.W. Jones, T. Mavromatis, S.W. Welch, K.J. Boote, and G.G. Wilkerson. 2000. Evaluating methods for simulating soybean cultivar responses using cross validation. Agron. J. 92:1140-1149.
- Jones, J.W., K.J. Boote, G. Hoogenboom, S.S. Jagtap, and G.G. Wilkerson. 1989. SOYGRO V5.42 Soybean crop growth simulation model, User's Guide. Florida Exp. St. J. No. 8304. University of Florida, Gainesville, FL.
- Mavromatis, T., K.J. Boote, J.W. Jones, A. Irmak, D. Shinde, and G. Hoogenboom. 2001. Developing genetic coefficients for crop simulation models using data from crop performance trials. Crop Sci. 41:40-51.
- Piper, E.L., K.J. Boote, and J.W. Jones. 1998. Evaluation and improvement of crop models using regional cultivar trial data. Appl. Eng. Agric. 14:435-446.
- Piper, E.L., K.J. Boote, J.W. Jones, and S.S. Grimm. 1996. Comparison of two phenology models for predicting flowering and maturity date of soybean. Crop Sci. 36:1606-1614.
- Raymer, P.L., J.L. Day, R.B. Bennet, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1993. 1992 Field Crops performance tests: Soybeans, peanuts, cotton, tobacco, sorghum, and summer annual forages. Research Report, No. 618, University of Georgia.
- Raymer, P.L., J.L. Day, R.B. Bennet, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1994. 1993 Field Crops performance tests: Soybean, peanut, cotton, tobacco, sorghum, and summer annual forages. Research Report, No. 627, University of Georgia.
- Raymer, P.L., J.L. Day, R.B. Bennet, R.D. Gipson, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1992. 1991 Field Crops performance tests: Soybeans, peanuts, cotton, tobacco, sorghum, and summer annual forages. Research Report, No. 609, University of Georgia.
- Raymer, P.L., J.L. Day, A.E. Coy, S.H. Baker, W.D. Branch, and S.H. LaHue. 1997. 1996 Field Crops performance tests: Soybean, peanut, cotton, tobacco, sorghum, grain millet and summer annual forages. Research Report, No. 644. University of Georgia.
- Raymer, P.L., J.L. Day, A.E. Coy, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1995. 1994 Field Crops performance tests: Soybean, peanut, cotton, tobacco, sorghum, grain millet, and summer annual forages. Research Report, No. 633. University of Georgia.
- Raymer, P.L., J.L. Day, A.E. Coy, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1996. 1995 Field Crops performance tests: Soybean, peanut, cotton, tobacco, sorghum, grain millet, and summer annual forages. Research Report, No. 639. University of Georgia.
- Raymer, P.L., J.L. Day, C.D. Fisher, and R.H. Heyerdahl. 1988. 1987 Field crops performance tests: Soybeans, peanuts, cotton, tobacco, sorghum, summer annual forages, and sunflowers. Research Report, No. 556, University of Georgia.
- Raymer, P.L., J.L. Day, C.D. Fisher, and R.H. Heyerdahl. 1989. 1988 Field crops performance Tests: Soybeans, peanuts, cotton, tobacco, sorghum, summer annual forages, and sunflowers. Research Report, No. 568, University of Georgia.
- Raymer, P.L., J.L. Day, and R.D. Gipson. 1990. 1989 Field crops performance tests. Research Report, No. 589, University of Georgia.
- Raymer, P.L., J.L. Day, R.D. Gipson, S.H. Baker, W.D. Branch, and M.G. Stephenson. 1991. 1990 Field crops performance tests: Soybeans, peanuts, cotton, tobacco, sorghum, and summer annual forages. Research Report, No. 599, University of Georgia.
- Ritchie, J.T., D.C. Godwin, and U. Singh. 1989. Soil and water inputs for the IBSNAT models. p. 31-45. In Proceedings of IBSNAT Symposium: Decision Support System for Agrotechnology Transfer. University of Hawaii, Honolulu, HI.
- Tsuji, G.Y., G. Uehara, and S. Balas (ed.) 1994. DSSAT version 3. University of Hawaii, Honolulu, HI.
- Willmott, C.J. 1982. Some comments on the evaluation of model performance. Bull. Am. Meteorol. Soc. 63:1309-1313.