a USDA-ARS, Box 646402, Washington State Univ., Pullman, WA 99164
b Dep. of Statistics, Washington State Univ., Pullman, WA 99164. Mention of product names does not represent an endorsement of any product or company by the USDA to the exclusion of other suitable products.
* Corresponding author (rcjohnson{at}wsu.edu).
| ABSTRACT |
|---|
Abbreviations: AFLP, amplified fragment length polymorphism; AWC, Arizona Wild Composite; GRIN, Germplasm Resources Information Network; PCR, polymerase chain reaction; RAPD, random amplified polymorphic DNA.
| INTRODUCTION |
|---|
Safflower, a diploid with 12 chromosome pairs (Ashri and Knowles, 1960), is a predominately self-pollinating species, but has the potential for considerable outcrossing with pollen transfer by a variety of insects (Butler et al., 1966). The USDA-ARS maintains a collection of safflower germplasm at the Western Regional Plant Introduction Station (WRPIS), Pullman, WA, which currently includes more than 2300 accessions. These accessions, representing germplasm from more than 50 countries, are available without charge to scientists worldwide.
Molecular markers can be used for identifying duplicate accessions, developing and testing special groups within collections (such as core collections), estimating and comparing diversity among countries or regions, identifying acquisition needs, and genetic mapping. Compared to many other crops, the use of molecular markers in safflower has been limited. Using isozymes, Bassiri (1977) identified wild safflower ecotypes and Carapetian and Estilai (1997) identified safflower hybrids. Zhang (2001) characterized 89 safflower accessions from numerous countries with isozymes.
Less reported for safflower are methods using the polymerase chain reaction (PCR). Random amplified polymorphic DNA (RAPD) markers were used by Yazdi-Samadi et al. (2001) to detect variation in 28 safflower accessions including Iranian landraces. They concluded that RAPD markers were useful for characterizing safflower diversity at the DNA level. Sehgal and Raina (2005) characterized 14 Indian safflower cultivars using RAPD, simple sequence repeats, and amplified fragment length polymorphism (AFLP). AFLP markers were found to be the most efficient system in their study, with two primer pairs sufficient to genotype the cultivars.
Characterization of safflower with molecular markers from diverse world sources is needed to enhance germplasm management and utilization. The objectives of this research were (i) to characterize AFLP variation and structure within and among safflower germplasm from diverse genetic sources and geographic regions, and (ii) to determine the association between phenotypic factors and AFLP marker data.
| MATERIALS AND METHODS |
|---|
Emerging tissue from the 12 plants representing each accession was sampled and combined (bulked), resulting in approximately 1 cm² of total leaf area per sample. These 96 samples represented accessions within each region and were designated as regional samples. For eight of the 96 accessions, a separate 1-cm² sample was taken from each of the 12 plants representing each accession. These were designated as population samples, representing plants within accessions. The eight populations were selected to represent a range of diversity. This included the highly variable AWC (PI 537682), a population developed through crosses with wild Carthamus species (Rubis et al., 1966), and the cultivar Girard (PI 525457), expected to have far less diversity through the process of breeding for uniform characteristics. The other six populations were selected at random from each of the six regions outside the Americas.
Tissue was freeze-dried and placed in a 1.5-mL Eppendorf tube with six 3-mm-diameter glass beads. Tubes were placed in a plastic tube rack and shaken in a paint shaker until the tissue was pulverized. Extraction of DNA was completed using the MagneSil kit by Promega (Madison, WI).
Amplified fragment length polymorphism genotyping similar to the methods of Vos et al. (1995) was performed using AFLP Analysis System I kits from Life Technologies (Gaithersburg, MD). The manufacturer's protocol was used with the following modifications. Restriction digestion was completed in a 25-µL reaction volume using 375 ng of template DNA. To confirm complete digestion, a 12.5-µL sample of DNA was visualized using agarose gel electrophoresis. A 12.5-µL volume of the adaptor–ligation mixture was added to the remaining restriction digestion to bring the total reaction volume to 25 µL. Following incubation, this reaction was diluted 1:10 and used as template for PCR pre-amplification. Pre-amplification was done in an 11-µL volume with 0.5 U of Taq polymerase (Biolase DNA Polymerase from Bioline Life USA, Inc., Randolph, MA), 2.5 mM MgCl₂, 8 µL of Pre-amp Primer Mix from the Life Technologies kit, and 1.25 µL of template DNA. The thermocycler was programmed for 20 cycles of 94°C for 30 s, 56°C for 60 s, 72°C for 60 s, followed by 72°C for 60 s. The completed pre-amplification reaction was diluted 1:10 with purified water and used as template for selective amplification. The selective amplification was modified to a 10-µL reaction, including 0.5 U of Taq polymerase, 1.5 mM MgCl₂ and the accompanying buffer, 2 µL of MseI primer mix from the Life Technologies kit, 0.5 pmol of fluorescent-labeled EcoRI primer (MWG-BioTech, High Point, NC), and 2 µL of pre-amplified template DNA. The thermocycler was programmed for 13 cycles of 94°C for 30 s, 65°C for 30 s with a temperature decrease of 0.7°C per cycle, 72°C for 60 s; followed by 23 cycles at 94°C for 30 s, 56°C for 30 s, 72°C for 60 s, and finally 72°C for 7 min. Separation and visualization of the markers were done on 6.5% KB+ polyacrylamide using a Li-Cor Gene ReadIR 4200 (Lincoln, NE). Images were printed for visual scoring using GeneImager software from Scanalytics (Rockville, MD).
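The touchdown phase of the selective amplification can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the first touchdown cycle anneals at 65°C and that the 0.7°C decrement is applied from the second cycle onward, which the text does not state explicitly. The function name is hypothetical.

```python
def touchdown_annealing_temps(start_c=65.0, step_c=0.7, cycles=13):
    """Annealing temperature (deg C) for each touchdown cycle.

    Assumption: cycle 1 anneals at start_c and each later cycle is
    step_c cooler than the one before it.
    """
    return [round(start_c - step_c * k, 1) for k in range(cycles)]


touchdown = touchdown_annealing_temps()
for cycle, temp in enumerate(touchdown, start=1):
    print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```

Under this assumption the last touchdown cycle anneals at 56.6°C, just above the fixed 56°C used for the remaining 23 cycles.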
The selective nucleotides for the primer pairs used were:
A matrix consisting of 1 (marker present) and 0 (marker absent) was developed by scoring clear, polymorphic bands and resulted in 102 markers used for both population and bulk samples.
The 1,0 matrix was used to calculate pairwise distances between individuals within the eight populations and between accessions within regions. The distance coefficient was the proportion of unmatched markers between a given pair of entries; this is equal to one minus the simple matching coefficient described in Romesburg (1984). This distance coefficient potentially ranges from zero, when all markers match, to one if there are no matches. Cluster analysis was completed using a distance matrix containing all population and bulk sample data with NTSYS 2.2 software (Exeter Software, Setauket, NY) and the UPGMA algorithm.
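The distance coefficient and the clustering step can be illustrated with a small standard-library sketch. It is not the NTSYS implementation; the function names and the three toy marker profiles are hypothetical, and UPGMA is written here as plain average linkage on the original pairwise distances.

```python
from itertools import combinations


def mismatch_distance(a, b):
    """Proportion of unmatched markers (1 - simple matching coefficient)."""
    if len(a) != len(b):
        raise ValueError("marker profiles must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(profiles):
    """Symmetric matrix of pairwise mismatch distances."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = mismatch_distance(profiles[i], profiles[j])
    return d


def upgma(d):
    """Average-linkage merge history: [(members_a, members_b, distance)]."""
    clusters = {i: (i,) for i in range(len(d))}

    def average(a, b):
        # Mean of all original distances between members of the two clusters.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(d[i][j] for i, j in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: average(*p))
        merges.append((clusters[a], clusters[b], average(a, b)))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Hypothetical 1,0 profiles for three entries scored at five markers.
toy_profiles = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
]
toy_distances = distance_matrix(toy_profiles)
toy_merges = upgma(toy_distances)
```

In the toy data the first two entries differ at one of five markers (distance 0.2) and join first; the third entry joins at the average of its distances to the first two.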
A nonparametric bootstrap procedure was used for computing standard errors of the mean genetic distance (SE) within and between populations and regions based on the AFLP markers. In addition, a test statistic and bootstrap procedure were developed for assessing differences in distance between populations or regions. First, for the ith and jth population or region, the average genetic distance within a given population or region was computed from the 1,0 matrix and denoted as $\bar{d}_i$ and $\bar{d}_j$. Likewise, the estimated average genetic distance between two populations or regions was computed for the ith and jth population or region and denoted as $\bar{d}_{ij}$. Using results of three plants from two of the populations sampled, the AWC and the cultivar Girard, a distance matrix is presented to illustrate how the within and between mean population distances are calculated (Table 2). The same calculation procedure was used for regions except the bulked accessions within regions were used instead of the plants within populations.

The standard error of each mean within distance (SE$_i$ and SE$_j$) was computed from the sample SD as shown by Efron and Tibshirani (1993). These were used to calculate the error term in a z test,

$$ z = \frac{\bar{d}_i - \bar{d}_j}{\sqrt{\mathrm{SE}_i^2 + \mathrm{SE}_j^2}} \qquad [1] $$
For comparing differences in mean genetic distances between populations or regions, procedures similar to Excoffier et al. (1992) were used. This involved computing the test statistic,
![]() | [2] |
where $\bar{d}_i$, $\bar{d}_j$, and $\bar{d}_{ij}$ were as defined above, and the sum is across all populations or regions. The null hypothesis is that populations or regions are derived from a common genetic source, with the mean within genetic distance equal to the mean between genetic distance. Thus, except for random sampling variation, W would be zero under the null hypothesis. However, if populations or regions differ, W would be greater than zero. A bootstrap procedure was used to assess this hypothesis by generating the distribution of W using pooled marker data for all populations or regions. The 1,0 arrays representing plants within populations or accessions within regions were sampled at random with replacement from the pooled data. The pseudosamples were the same size as for the original data. From these pseudosamples, the W values were generated. This process was repeated 10000 times, and the null hypothesis of no difference was rejected if no more than 5% (P < 0.05) of the pseudovalues of W exceeded the value of W computed from the original sample data.
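The pooled bootstrap described above can be sketched as follows. The exact form of W is not recoverable from the text here, so the code assumes one form that is consistent with the description: the sum, over all pairs of groups, of the mean between-group distance minus the average of the two mean within-group distances. That form, the function names, and the toy data are all assumptions made for illustration.

```python
import random
from itertools import combinations


def dist(a, b):
    """Proportion of unmatched markers between two 1,0 profiles."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(group):
    """Mean pairwise distance among profiles in one group (needs >= 2)."""
    pairs = list(combinations(group, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def mean_between(g1, g2):
    """Mean distance over all cross-group pairs of profiles."""
    return sum(dist(a, b) for a in g1 for b in g2) / (len(g1) * len(g2))


def w_statistic(groups):
    # ASSUMED form of W: between-group mean minus the average of the two
    # within-group means, summed over all pairs of groups.
    return sum(
        mean_between(gi, gj) - (mean_within(gi) + mean_within(gj)) / 2
        for gi, gj in combinations(groups, 2)
    )


def bootstrap_w(groups, n_boot=10000, seed=0):
    """Observed W and the share of pooled-bootstrap W values >= observed."""
    rng = random.Random(seed)
    observed = w_statistic(groups)
    pooled = [profile for g in groups for profile in g]
    exceed = 0
    for _ in range(n_boot):
        # Pseudosamples keep the original group sizes but draw from the pool.
        pseudo = [[rng.choice(pooled) for _ in g] for g in groups]
        if w_statistic(pseudo) >= observed:
            exceed += 1
    return observed, exceed / n_boot


# Hypothetical groups of four profiles scored at six markers.
group_a = [[1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0],
           [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 0, 0]]
group_b = [[0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1],
           [0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 1, 1]]
w_obs, p_value = bootstrap_w([group_a, group_b], n_boot=2000)
```

Counting pseudovalues that equal the observed W as exceedances is a slightly conservative choice; the text only specifies values that exceed it.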
Bulk DNA samples representing accessions from world regions were structured into clusters using the model-based method developed by Pritchard et al. (2000a). For this analysis, the regional hierarchy in Table 1 was not used. Thus, the structure of the groups or clusters was based solely on the analysis of marker frequencies. The aim was to develop independent groups to understand if, and to what extent, the regions were distinguished without the a priori regional classification. We used the STRUCTURE software Version 2.1 (2004; available from http://pritch.bsd.uchicago.edu/software.html) and statistics outlined by Evanno et al. (2005) to determine the number of structured groups (K clusters). Program settings used the admixture ancestry and correlated marker frequency models. The graphs of L(K), as defined by Evanno et al. (2005), along with its variance, minimum alpha, and the modal values of ΔK, were used to determine the number of clusters used for estimating admixtures. In simulations, Evanno et al. (2005) found that ΔK was a good predictor of the true cluster number. The length of burn-in was set at 10000 followed by 30000 iterations. Five replications were performed at each proposed K. The factor Q, the proportion of markers from each region, was calculated for each cluster.
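As an illustration of how ΔK is obtained from replicate runs, the sketch below uses one common reading of the Evanno et al. (2005) statistic: the absolute second-order change in the mean L(K), divided by the standard deviation of L(K) across replicates. The log-likelihood values are invented for the example and are not results from this study; the function name is hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map K -> delta K from {K: [L(K) for each replicate run]}.

    Uses |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / SD of L(K).
    K values without both neighbours are skipped.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue
        second_order = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            raise ValueError(f"replicates at K={k} are identical; delta K undefined")
        result[k] = second_order / spread
    return result


# Invented replicate log-likelihoods (three runs per K), for illustration only.
example_runs = {
    1: [-5000, -5002, -4998],
    2: [-4600, -4603, -4597],
    3: [-4500, -4501, -4499],
    4: [-4480, -4490, -4470],
    5: [-4470, -4485, -4455],
}
example_delta_k = delta_k(example_runs)
best_k = max(example_delta_k, key=example_delta_k.get)
```

The modal K is the one with the largest ΔK; the smallest and largest K tested have no ΔK because they lack a neighbour on one side.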
The Gst statistic gives the proportion of genetic diversity that resides among populations or groups. It was calculated as outlined by Culley et al. (2002) using the Nei (1973) method in which the average of Ht (the total genetic diversity) and Hs (the diversity within the populations or subgroups) are taken across all loci. The resulting means were used to calculate Gst = 1 – (Hs/Ht). This was done for both the populations and regions. In the case of AFLP markers, gene diversity refers to marker diversity.
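A simplified sketch of the Gst calculation for 1,0 marker data follows. It treats the band frequency at each marker as the quantity whose diversity is measured and weights groups equally; both choices are assumptions made for illustration and may differ from the estimation details in Culley et al. (2002). The function names are hypothetical.

```python
def marker_diversity(p):
    """Diversity of a two-state marker with band frequency p: 1 - p^2 - (1-p)^2."""
    return 2 * p * (1 - p)


def gst(groups):
    """Gst = 1 - Hs/Ht, with Hs and Ht averaged across markers.

    groups: list of groups, each a list of 1,0 profiles of equal length.
    Assumption: groups are weighted equally regardless of size.
    """
    n_markers = len(groups[0][0])
    hs_total = ht_total = 0.0
    for m in range(n_markers):
        freqs = [sum(profile[m] for profile in g) / len(g) for g in groups]
        # Hs: mean within-group diversity; Ht: diversity of the mean frequency.
        hs_total += sum(marker_diversity(p) for p in freqs) / len(freqs)
        ht_total += marker_diversity(sum(freqs) / len(freqs))
    hs, ht = hs_total / n_markers, ht_total / n_markers
    if ht == 0:
        raise ValueError("no marker variation; Gst is undefined")
    return 1 - hs / ht


# Two hypothetical extremes: fully differentiated vs. identical groups.
gst_fixed = gst([[[1, 0], [1, 0]], [[0, 1], [0, 1]]])
gst_same = gst([[[1, 0], [0, 1]], [[1, 0], [0, 1]]])
```

When groups are fixed for different bands all diversity is among groups (Gst = 1); when groups have identical band frequencies none is (Gst = 0).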
Matrix correlation and Mantel tests were completed using NTSYS 2.2 software to determine if and to what extent the AFLP marker data were associated with phenotypic data. Nine oil and meal characteristics and seven growth descriptors for the safflower core collection were reported by Johnson et al. (1999, 2001). The oil and meal characteristics were the percent oil, linoleic acid, oleic acid, palmitic acid, and stearic acid; α- and β-tocopherol concentration in the oil; and percent 2-hydroxyarctiin and matairesinol phenolic glucosides in the flour. Other phenotypic attributes were the width and length of the outer involucral bract, head diameter, days to flowering, plant height, yield per plant, and average seed weight. Oil, meal, and growth data from those studies were available for 90 of the 96 accessions in the bulk DNA analysis. The six accessions without phenotypic data were not included in the correlation. Five were from the Americas (PI numbers 525457, 560200, 537682, 537692, and 613394) and one was from the Mediterranean region (PI 613465). The phenotypic data were standardized to have a mean of zero and a variance of one, and a pairwise distance matrix was developed using the average Euclidean distance coefficient with NTSYS 2.2 software. For the AFLP data the distance coefficient defined above was used. Matrix correlation and Mantel tests were performed for the entire data set and for each region separately. For the Mantel test 1000 permutations were calculated.
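The matrix correlation and Mantel permutation test can be sketched with the standard library alone. This is not the NTSYS routine: the one-tailed test, the function names, and the toy matrices are assumptions for illustration. The idea is to correlate the off-diagonal entries of the two distance matrices, then relabel the entries of one matrix at random to see how often a correlation at least as large arises by chance.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper_triangle(matrix, order):
    """Off-diagonal entries above the diagonal, with rows/columns relabelled."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def mantel(a, b, n_perm=1000, seed=0):
    """Matrix correlation r and a one-tailed permutation P value."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute entry labels of b, rows and columns together
        if pearson(x, upper_triangle(b, perm)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy matrices: six hypothetical entries placed on a line, with one matrix
# holding absolute differences and the other squared differences.
positions = [0, 1, 2, 5, 9, 14]
marker_like = [[abs(p - q) for q in positions] for p in positions]
pheno_like = [[(p - q) ** 2 for q in positions] for p in positions]
r_toy, p_toy = mantel(marker_like, pheno_like)
```

Squaring r gives the proportion of variation in one distance matrix accounted for by the other, which is how a correlation is converted to a percentage of variation explained.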
| RESULTS AND DISCUSSION |
|---|
Our hypothesis was that AFLP marker diversity would be higher within the AWC population than within the cultivar Girard (PI 525457). This is because the AWC was developed through crosses of several wild Carthamus species to domestic safflower (Rubis et al., 1966). As such, it is the most variable safflower population known. Girard, on the other hand, resulted from the process of cultivar development (Bergman et al., 1989), in which high phenotypic uniformity is desired and consequently less diversity within the population would be expected. Indeed, the AWC population had a mean within-population distance that was more than 20 times greater than that of Girard (Table 3). Differences in variability within populations were detected in 22 of 28 possible comparisons (Table 3). Even so, most populations had relatively low average genetic distance (Table 3). Among the eight populations studied, only the AWC had a mean within-population distance of more than 0.0859, showing the relatively high uniformity within populations. PI 544006 from China had a very low within-population distance, showing the high level of marker uniformity that can occur in safflower. Overall, the relatively high within-population uniformity is likely associated with the predominance of inbreeding through self-pollination together with selection. However, the potential for insect-mediated outcrossing can lead to significant variation, especially within unimproved populations.
In all cases, plants from the eight populations (Table 3) clustered with the bulked sample of their respective accession. Bulking could change the pattern of marker visualization and thus the distance scores compared to populations. However, our bulk sampling protocol was found to accurately represent accessions. This is shown for the AWC and Girard (Fig. 1). The bulk AWC sample was less centered in its population than the other seven bulk samples, but the AWC population was also highly variable compared to the other populations.
In all cases except China, different countries were combined to form regions (Table 1). The country with the most accessions was the USA, with six each originating from the breeding programs of P. Knowles (California) and D. Rubis (Arizona), and one accession from the breeding program of J. Bergman (Montana) (Table 1). For the Americas, germplasm from Argentina and Canada was also included. The mean distances within regions ranged from 0.159 for the Americas to 0.098 for South Central Asia (Table 4). The within-region comparisons showed that the Americas group was more diverse than the other regions (Table 4, above diagonal). The relatively high diversity of the Americas group can be attributed to the wide range of genetic material used in U.S. breeding programs and the selection of progeny from a wide range of environments. This included various hull characteristics, disease resistance, meal attributes, and even wild safflower genes represented in the AWC. Only two of the 16 accessions were cultivars. For other regions the accessions tended to be landraces used by local people without the introduction of more exotic germplasm typical of safflower breeding programs (Table 1).
The STRUCTURE procedure resulted in nine clusters that were formed without the a priori regional classification (Table 1). In Table 5, region-of-origin information is presented so that the regional classification can be compared with the independently formed STRUCTURE clusters. In general, the results of the STRUCTURE analysis showed why differences were found between regions (Table 4). The majority of the membership originated from a single region in three cases (cluster 3b, China, 81%; cluster 2, Southwest Asia, 64%; and cluster 4b, East Europe, 63%) (Table 5). East Africa and South Central Asia comprised nearly half the membership in clusters 1a and 1b, and the Mediterranean region was strongly represented in cluster 1c. Thus, six of the seven regions were predominantly or strongly represented in six clusters.
The association between the Americas and China was less anomalous than it might first appear. GRIN records show the Chinese cultivar TA-1 (Table 1) was developed from a cross between Tacheng, a local cultivar from the Xinjiang region of northwestern China, near the town of Tacheng, and the cultivar AC-1. AC-1, a high seed oil type developed by the Anderson Clayton Company, Phoenix, AZ, has been used extensively in breeding programs in the United States (Bergman et al., 1985), especially in the Montana State University releases Sidwell, Oker, Hartman, and Rehbein. So it appeared that germplasm exchange was a major factor in the admixture between China and the Americas in cluster 4a1. The dispersion of membership from the Americas across different clusters is logical as all the safflower in the Americas was introduced from other world regions. Nevertheless, there was still enough of a difference between the Americas and other regions to statistically distinguish the Americas from all other regions (Table 4).
As measured by the Gst statistic, the proportion of variation among the populations in Table 3 was 56%. This showed the majority of the marker variation was among rather than within populations, yet the contribution of both was substantial. These populations were selected from widely diverse origins to measure the widest range of population diversity possible, so the relatively large diversity among populations is reasonable. However, this result may not be representative of populations derived at random.
For the bulk samples, the variation among regions was only 25% of the total leaving 75% within the regions. Nevertheless, differences among regions were strong enough to be significant in all cases (Table 4). Except for the Americas, none of the mean distances within regions differed significantly (Table 4). This shows that although variation within regions was predominant it was also relatively uniform across most regions.
There is a continued need for direct evaluation for desired traits such as oil quality, but if correlation between diversity measured by molecular markers and diversity measured by phenotypic factors were high, it would facilitate management of genetic resources, as overall diversity could be largely assessed in the laboratory. The overall correlation of the distance matrices based on AFLP data with average Euclidean distance matrices based on phenotypic factors was significant (Table 6). Correlations within regions were also significant for China, Southwest Asia, and the Mediterranean regions (Table 6). The correlation within China was the strongest, accounting for 39% of the variation, but in most cases the correlations within regions were either weak or not significant. The overall correlation was likewise weak, explaining just 1.4% of the variation (r = 0.12), and was not strong enough to displace the need for overall phenotypic characterization for diversity in safflower. This is not an unusual result. Johnson et al. (2002) found a correlation between RAPD markers and phenotypic data of r = 0.14 in Kentucky bluegrass (Poa pratensis L.). Reed and Frankham (2001) examined 71 data sets, mostly with allozyme data, and found that the mean correlation coefficient between molecular and phenotypic variation was 0.36, but lower for correlations based on additive genetic variance. In the absence of linkage, Reed and Frankham (2001) suggested the reason for the low correlations is that variation in molecular markers results mostly from genetic drift whereas adaptive phenotypic variation is driven more by selection. However, Merilä and Crnokrak (2001) found relatively high correlations between standardized measurements of marker data and quantitative data. Given the random nature of AFLP markers, they would not be expected to correlate strongly with phenotypic factors. However, the weak but significant correlation does show a degree of correspondence.
This correspondence can be quite strong in some cases, for example, when comparing the molecular distances of the AWC and Girard (Fig. 1) with phenotypic observations. Field observations confirm the high phenotypic variation in the AWC for factors such as plant height, spininess, flower color, and maturity—variation not expected or observed to a great extent in cultivars.
| NOTES |
|---|
Received for publication December 1, 2006.
| REFERENCES |
|---|