Published online 1 September 2007
Published in Crop Sci 47:1964-1974 (2007)
© 2007 Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

PLANT GENETIC RESOURCES

Assignment Tests for Variety Identification Compared to Genetic Similarity-Based Methods Using Experimental Datasets from Different Marker Systems in Sugar Beet

J. De Riek (a,*), I. Everaert (a), D. Esselink (b), E. Calsyn (a), M. J. M. Smulders (b), and B. Vosman (b)

a ILVO-Institute for Agricultural and Fisheries Research, Plant Science Unit, Caritasstraat 21, 9090 Melle, Belgium
b Plant Research International, P.O. Box 16, 6700 AA Wageningen, the Netherlands

* Corresponding author (jan.deriek@ilvo.vlaanderen.be).


    ABSTRACT
High genetic variation within sugar beet (Beta vulgaris L.) varieties hampers reliable classification procedures independent of the type of marker technique applied. Datasets on amplified fragment length polymorphisms, sequence tagged microsatellite sites, and cleaved amplified polymorphic sites markers in eight sugar beet varieties were subjected to supervised classifiers, methods in which individual assignments are made to predefined classes, and unsupervised classifiers, defined afterward on the similarity in marker composition from sampled individuals. Major issues addressed are (i) which classification method gives the most consistent results when three marker techniques are compared, and (ii) given different classification techniques available, for which marker technique is the output generated least constrained by the way data analysis is performed. Assignment tests showed a higher consistency across classifications independent from the marker technique. A good allocation to the proper variety was obtained, together with a reliable allocation pattern among the other varieties. Both aspects deal with the variation within a variety and the distance to other varieties. Assignment data were transformed into an average similarity measure, similarity by assignment (Sax,y), which is a new genetic distance measure with interesting properties.

Abbreviations: AFLP, amplified fragment length polymorphism • CAPS, cleaved amplified polymorphic site • DEucl, Euclidean distance • DNei, Nei genetic distance • PCR, polymerase chain reaction • SJacc, Jaccard similarity coefficient • SSM, simple matching similarity coefficient • STMS, sequenced tagged microsatellite site


    INTRODUCTION
THE GENETIC CHARACTERIZATION by means of molecular markers offers an appealing approach to variety registration and protection. Molecular markers can provide a fast and reliable identification tool applicable during all stages of seed production, trading, and agricultural production and processing. These properties match the demand of seed companies for better protection of hybrids and inbred lines. For many crops, the number of informative morphological characteristics is limited. However, the high variation within varieties hampers fingerprinting with molecular markers and the construction of reference databases containing molecular profiles.

Different types of biochemical and molecular markers have been developed and used in sugar beet (Beta vulgaris L.). Biochemical markers (i.e., isozymes or protein patterns) are laborious (Jung et al., 1993) and have a low degree of polymorphism (Schneider et al., 1999). On the other hand, random amplified polymorphic DNA is insufficiently reproducible across years and laboratories (Barzen et al., 1995; Jones et al., 1997). In sugar beet, amplified fragment length polymorphisms (AFLPs) (Barnes et al., 1996; Barzen et al., 1995; Pillen et al., 1992; Schondelmaier et al., 1996; Schumacher et al., 1997) and microsatellites (Rae et al., 2000) have been used for mapping and fingerprinting. Cleaved amplified polymorphic site (CAPS) markers have also been identified (Paran and Michelmore, 1993). Sequenced tagged microsatellite site (STMS) markers have been developed for sugar beet (Rae et al., 2000). Microsatellite markers were shown to be effective in variety identification (Esselink et al., 2003; Bredemeijer et al., 2002; Röder et al., 2002).

We analyzed three different marker datasets (AFLP, STMS, and CAPS). Two types of data analysis were compared: supervised classifiers and unsupervised classifiers. Supervised classifiers represent a group of methods in which individual assignments are made to predefined classes. In unsupervised classification, classes are defined a posteriori based on the degree of difference or similarity in marker composition among sampled individuals (Guinand et al., 2002). Two major issues are addressed in this study. First, classification methods were evaluated for the consistency of their results across the three marker techniques. Second, given the different classification techniques available, marker techniques were compared to find out which one yields the most reliable or the least constrained summarizing output, independent of the way data analysis was performed. Finally, we discuss the potential of assignment tests for the identification or evaluation of varieties.


    MATERIALS AND METHODS
Plant Material, DNA Isolation, and Marker Analysis
Eight sugar beet varieties were included in the present study; six were triploid and two were diploid: Ariana (KWS Saat AG, Einbeck, Germany), Aurelia (KWS), Fortis (2n = 2x; Hilleshög, Syngenta Seeds, Landskrona, Sweden), Princesse (Delitzsch Pflanzenzucht, Winsen, Germany), Sylvester (Vanderhave, Advanta, Rilland, the Netherlands), H66377 (Vanderhave), KWS8123 (2n = 2x; KWS), and MK9907 (Kühn & Co. International B.V., Bergen op Zoom, the Netherlands). Seeds were obtained from the Belgian sugar beet research institute KBIVB-Tienen, which is also responsible for variety testing. Thirty individual plants per variety were analyzed. DNA isolation and AFLP (Vos et al., 1995) analysis were as described in De Riek et al. (2001), using the commercially available AFLP kit from PerkinElmer Life and Analytical Sciences, Inc. (Waltham, MA) for fluorescent fragment detection. EcoRI and MseI were used for DNA digestion. Three primer combinations with six selective bases were applied: EcoRI-ACA and MseI-CTG; EcoRI-ACT and MseI-CAT; and EcoRI-AGG and MseI-CTT (De Riek et al., 2001).

Microsatellites were isolated from enriched small-insert genomic libraries (Esselink, unpublished data, 2000); 12 STMS markers were used: Bvv 15, Bvv 17, Bvv 21, Bvv 23, Bvv 30, Bvv 32, Bvv 43, Bvv 51, Bvv 53, Bvv 60, Bvv 61, and Bvv 64. Sequenced tagged microsatellite site markers were amplified in a 20-µL reaction volume containing 20 ng of genomic DNA, 2 to 10 pmol of each primer, 100 µmol L⁻¹ of each dNTP, 10 mmol L⁻¹ Tris-HCl pH 9.0, 20 mmol L⁻¹ (NH₄)₂SO₄, 0.01% Tween 20, 1.5 mmol L⁻¹ MgCl₂, and 0.3 U Goldstar Taq DNA polymerase. Amplifications were performed using a PerkinElmer 9600; polymerase chain reaction (PCR) conditions were 94°C for 3 min followed by 30 cycles of 94°C for 30 s, the calculated annealing temperature for 30 s, and 72°C for 60 s, and a final extension at 72°C for 3 min. Multiplex sets were composed of loci with corresponding reaction conditions, each set containing three microsatellite loci labeled with different fluorescent dyes (FAM, HEX, NED).

Amplified fragment length polymorphism and STMS fragments were separated by polyacrylamide gel electrophoresis on an ABI Prism 377 DNA Sequencer (Applied Biosystems, Foster City, CA) on 36-cm gels using 4.25% denaturing polyacrylamide (4.25% acrylamide/bisacrylamide 19:1, 6 mol L⁻¹ urea in 1x TBE). GS-500 ROX-labeled size standard (PerkinElmer) was loaded in each lane to facilitate the automatic analysis of the gel and the sizing of the fragments. Genescan 2.1 (Applied Biosystems) was used to estimate detection time, signal peak height, and surface for each fragment. Amplified fragment length polymorphism scoring was conducted as described by De Riek et al. (2001). For STMS analysis, only a selected set was used; null alleles were ignored, and alleles whose frequency was below 1% were excluded.

Cleaved amplified polymorphic site markers were selected from a list of codominant markers developed for sugar beet by Schneider et al. (1999; Table 1). Polymerase chain reaction conditions were as specified by Schneider et al. (1999). The fragments were separated by electrophoresis (4 h, 80 V) on a 2% Tris-borate-EDTA-buffered agarose gel. A scoring table (1/0) was generated by visual scoring after ethidium bromide staining and UV illumination of the gel. Fragment size was estimated against a GeneRuler 100bp Plus and a GeneRuler 50bp ladder (Fermentas International Inc., Burlington, ON). For all markers, CAPS alleles were scored as present or absent.


Table 1. Gene localization and restriction enzyme used for the cleaved amplified polymorphic site (CAPS) primer sets taken from Schneider et al. (1999).

 
Statistical Analyses
The Jaccard (SJacc) and simple matching similarity (SSM) coefficients between two genotypes, the Euclidean distance (DEucl) and the Nei distance (DNei) between two populations, and bootstrapping procedures were as in De Riek et al. (2001). A "dominant" scoring was used for all marker techniques, that is, presence of marker bands (marker frequencies) instead of allelic composition (allelic frequencies).
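As an illustration of how the two similarity coefficients and the binary Euclidean distance behave on "dominant" 0/1 scores, the following Python sketch computes them for a pair of band profiles. The function names and the two six-band example profiles are hypothetical and are not taken from the study; returning 0 for the Jaccard coefficient when neither plant shows any band is an assumed convention.

```python
def jaccard(a, b):
    """Jaccard similarity: shared bands / bands present in at least one profile."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two profiles without any band get similarity 0.
    return both / either if either else 0.0


def simple_matching(a, b):
    """Simple matching similarity: positions with equal score / all positions."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def euclidean(a, b):
    """Euclidean distance between two 0/1 profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical band profiles of two plants (1 = band present, 0 = absent).
plant_1 = [1, 1, 0, 0, 1, 0]
plant_2 = [1, 0, 0, 1, 1, 0]

s_jacc = jaccard(plant_1, plant_2)          # 2 shared bands out of 4 observed
s_sm = simple_matching(plant_1, plant_2)    # 4 matching positions out of 6
d_eucl = euclidean(plant_1, plant_2)        # square root of 2 mismatches
```

Jaccard ignores shared absences, whereas simple matching counts them as agreement, so for the same pair of profiles the simple matching value is never lower than the Jaccard value.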

Analysis of molecular variance (AMOVA) was applied (Schneider et al., 2000) to the Euclidean distance matrix between individual genotypes to partition genetic variation among ploidy levels and varieties. The assignment based on the highest probability of an individual's genotype in any of the populations was calculated using the "Doh" software (http://www2.biology.ualberta.ca/jbrzusto/; verified 7 August 2007) starting from the 0/1 data as described by Paetkau et al. (1995, 1997). This includes the calculation of a matrix of distances (d) between each pair of populations, calculated as

$$ d_{x,y} = \frac{A_{x,y} + A_{y,x}}{2} \qquad [1] $$

based on the nonsymmetric matrix A defined by

$$ A_{x,y} = \frac{1}{n_x} \sum_{i \in x} \log \frac{\Pr_x(g_i)}{\Pr_y(g_i)} \qquad [2] $$

where x, y are populations, n_x is the size of population x, g_i is the genotype of individual i, and Pr_x is the genotype probability calculated in population x. A_{x,y} is a measure of how much more likely genotypes of individuals sampled in population x are in population x than in population y. A is not symmetric.
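A minimal Python sketch of such an assignment test on 0/1 band data is given here, assuming that A is the mean log likelihood ratio of the genotypes sampled in x and that d averages the two directions. Several choices are assumptions of this sketch and do not describe the Doh software: bands are treated as independent, band frequencies are smoothed with a pseudo-count so that no probability is zero, base-10 logarithms are used, and an individual is not left out when the frequencies of its own population are estimated. All names and the example data are hypothetical.

```python
import math


def band_frequencies(profiles, pseudo=0.5):
    """Smoothed frequency of band presence per marker (pseudo-count is an assumption)."""
    n_plants = len(profiles)
    n_bands = len(profiles[0])
    return [
        (sum(p[j] for p in profiles) + pseudo) / (n_plants + 2 * pseudo)
        for j in range(n_bands)
    ]


def log_probability(profile, freqs):
    """log10 probability of a 0/1 profile, assuming independent bands."""
    return sum(math.log10(f if band else 1.0 - f) for band, f in zip(profile, freqs))


def assign(profile, freqs_by_variety):
    """Variety in which the profile has the highest probability."""
    return max(freqs_by_variety,
               key=lambda v: log_probability(profile, freqs_by_variety[v]))


def assignment_matrix(populations):
    """A[x][y]: mean log10 ratio Pr_x(g) / Pr_y(g) over the genotypes g sampled in x."""
    freqs = {v: band_frequencies(p) for v, p in populations.items()}
    A = {}
    for x, profiles in populations.items():
        A[x] = {
            y: sum(log_probability(g, freqs[x]) - log_probability(g, freqs[y])
                   for g in profiles) / len(profiles)
            for y in populations
        }
    return A, freqs


def distance(A, x, y):
    """Symmetric distance obtained by averaging A[x][y] and A[y][x]."""
    return (A[x][y] + A[y][x]) / 2.0


# Hypothetical data: two varieties, three plants each, three bands.
populations = {
    "V1": [[1, 1, 0], [1, 1, 0], [1, 0, 0]],
    "V2": [[0, 0, 1], [0, 1, 1], [0, 0, 1]],
}
A, freqs = assignment_matrix(populations)
d_12 = distance(A, "V1", "V2")
best = assign([1, 1, 0], freqs)
```

The diagonal of A is zero by construction, and a large positive A[x][y] means that plants of x are far more probable under the band frequencies of x than under those of y.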

For the assignment tests based on the pairwise similarity matrix (De Riek et al., 2001), a ranking of the 30 most similar partners was first made per individual plant. For this, a 240 by 240 similarity matrix was constructed using SJacc or SSM. Second, per individual plant, the variety of origin was recorded for each of the individual genotypes showing the highest similarity (we used the 3, 10, or 30 highest-ranking partners). The distribution of these highest-ranking partners over all varieties in the dataset is then displayed in an assignment table, in which the assignments of all individuals of the variety under analysis are grouped. In the same way as described by Paetkau et al. (1997) for the derivation of dx,y from Ax,y (shown above), this asymmetric assignment table was converted into a symmetric similarity measure, similarity by assignment (Sax,y), by expressing the assignments as relative values and averaging the assignment values Vassx,y and Vassy,x.
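The procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the plant itself is excluded from its own ranking, ties in similarity are broken by input order, "relative value" is taken to mean row proportions of the assignment table, and all names and the six example profiles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def assignment_table(profiles, labels, k, similarity=jaccard):
    """Count, per variety, the varieties of origin of each plant's k most similar partners."""
    varieties = sorted(set(labels))
    table = {x: {y: 0 for y in varieties} for x in varieties}
    for i, g in enumerate(profiles):
        # Rank all other plants by decreasing similarity (self excluded: assumption).
        partners = sorted(
            (j for j in range(len(profiles)) if j != i),
            key=lambda j: similarity(g, profiles[j]),
            reverse=True,
        )[:k]
        for j in partners:
            table[labels[i]][labels[j]] += 1
    return table


def similarity_by_assignment(table):
    """Turn the asymmetric table into row proportions and average both directions."""
    rel = {x: {y: c / sum(row.values()) for y, c in row.items()}
           for x, row in table.items()}
    return {x: {y: (rel[x][y] + rel[y][x]) / 2 for y in rel} for x in rel}


# Hypothetical data: two varieties, three plants each, four bands.
profiles = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0],   # variety A
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1],   # variety B
]
labels = ["A", "A", "A", "B", "B", "B"]

table = assignment_table(profiles, labels, k=3)
sa = similarity_by_assignment(table)
```

With only three plants per variety and k = 3, every plant is forced to take one partner from the other variety, so this toy table mainly shows the mechanics; in the study, 30 plants per variety were ranked.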

Mantel analysis (Mantel, 1967; Mantel Nonparametric Test Calculator for Windows, Version 2.00, 1999, by Adam Liedloff) was used to compute standardized Mantel's statistics between two similarity matrices. The significance of the statistic was evaluated by permutation (1000 permutations) and expressed as a probability (Smouse et al., 1986).
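For readers unfamiliar with the test, a minimal permutation-based Mantel sketch in Python is shown below. It is not the Liedloff calculator: the standardized statistic is computed as the Pearson correlation between the off-diagonal elements of the two matrices, the test is one-sided, the P value uses the (hits + 1) / (permutations + 1) convention, and the function names and example matrix are hypothetical.

```python
import random


def upper_triangle(m):
    """Off-diagonal elements above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(u, v):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    sd_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return cov / (sd_u * sd_v)


def mantel(m1, m2, permutations=1000, seed=0):
    """Standardized Mantel statistic r and one-sided permutation P value."""
    rng = random.Random(seed)
    n = len(m1)
    observed = pearson(upper_triangle(m1), upper_triangle(m2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of the second matrix with the same order.
        rng.shuffle(order)
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(m1), upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical 4 x 4 distance matrix compared with itself.
m1 = [
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
]
r_self, p_self = mantel(m1, m1, permutations=200)
```

Rows and columns are permuted together so that each permuted matrix is still a valid distance matrix; with eight varieties there are 8! = 40,320 such orderings, of which a random subset is sampled.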


    RESULTS AND DISCUSSION
Characteristics of the Experimental Data Sets
In total, 405 AFLP, 12 STMS, and 10 CAPS markers were selected for genotyping this set of varieties. For all marker techniques, the presence or absence of (allelic) bands in the individual plants was scored. For the STMS markers, in total 53 different alleles with an allelic band frequency above 0.01 were scored; for 48 of these the allelic band frequency was above 0.05. The 10 CAPS markers taken from Schneider et al. (1999) gave, in total, 57 bands.

Table 2 lists some key figures that differentiate the power of the three molecular marker techniques used. For comparison of the two codominant techniques, the average numbers of bands per locus for the total dataset or on average per single variety are listed. Power of discrimination between individual plants can also be evaluated by the number of unique allelic phenotypes (Becher et al., 2000; Esselink et al., 2003), each representing a unique combination of alleles for a particular codominant marker in a given genotype. This approach was introduced to circumvent the problem of determining the actual genotype in a polyploid species if one cannot exactly determine whether a specific allele is present in two or three copies (Esselink et al., 2004; Nybom et al., 2004). These descriptive statistics (Table 2) make clear that the large number of AFLP markers outcompetes the codominant marker datasets in characteristics related to the number of data points. The datasets for STMS and CAPS are comparable for most statistics, including the number of allelic phenotypes that they can distinguish within the total dataset. However, STMS tend to detect fewer allelic phenotypes within one variety, which suggests that this particular dataset may be superior in revealing differences between varieties.


Table 2. Key figures of the three molecular methods used.

 
Unsupervised Classification based on Differences in Similarity and Clustering
Genetic Distances between Individual Plants
For the separate marker datasets, individual pairwise similarity matrices (240 by 240) were constructed using SJacc, SSM, and the binary DEucl; Mantel's tests were then applied to these matrices to evaluate the concordance of the genetic relationships revealed by each of the marker techniques. The Mantel correlation coefficient r ranged from –0.007 for the comparison of AFLP with CAPS to 0.14 for the comparisons of CAPS with STMS and AFLP with STMS (P < 0.02). These values indicate a rather poor correspondence between the data structures of the three matrices at the individual plant level. However, when comparing varieties, the direct relationships between the individual genotypes within varieties are of less concern than the overall view of the group of genotypes making up a variety. Therefore, we took the analysis to that level. Table 3 gives the average SJacc and SSM taken over all pairwise comparisons between two accessions for each marker technique. Values closer to 1 indicate higher similarity. Across all pairwise comparisons between plants within a variety, KWS8123 had the highest internal average similarity, independent of the similarity measure or the marker dataset. The variety showing the lowest internal average similarity did depend on the dataset: for AFLP, Ariana, MK9907, and Sylvester are among the lowest; for CAPS, Fortis and MK9907; and for STMS, Sylvester, Ariana, and Princesse. When the internal average similarity of a variety was compared to the average similarities between this variety and the others, the internal average similarity was always higher, indicating that, without exception, plants belonging to a particular variety are on average more similar to each other than to plants of another variety. This indicates that all datasets at least partly reveal the genetic structure of the varieties.


Table 3. The average similarity taken over all pairwise comparisons between plants of two accessions for amplified fragment length polymorphism (AFLP), cleaved amplified polymorphic site (CAPS), and sequenced tagged microsatellite site (STMS) markers (below the diagonal Jaccard; above the diagonal Simple Matching coefficients). On the diagonal the average similarity for all pairwise comparisons internal to each variety is given for both coefficients.

 
Classification based on Ordinations from Marker Frequency Data
Table 4 gives the standard DNei between pairs of varieties together with its standard errors. Using the AFLP dataset the standard DNei ranges from 0.02 to 0.08; using the CAPS dataset, from 0.02 to 0.13; and using the STMS dataset, from 0.04 to 0.22. The range obtained with this AFLP dataset is somewhat higher than reported before (De Riek et al., 2001) using the same AFLP primer combinations. However, in the previous study, 90 individual plants taken from three seed production years were analyzed, which makes the potential variation within a variety larger (and hence the distances between varieties lower).


Table 4. The standard Nei genetic distance (× 10⁻²) between pairs of varieties (below the diagonal) and its standard errors (above the diagonal) for the three marker techniques used.†

 
The range of DNei indicates that the discriminatory capacity was lowest for the AFLP dataset, despite its including the highest number of data points, and increased for CAPS and STMS. However, standard errors for CAPS and STMS data were much higher (Table 4, above the diagonal), making them less precise estimates.
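To make the distance measure concrete, the sketch below computes Nei's standard genetic distance, D = −ln(Jxy / √(Jx·Jy)), from band frequencies in Python. The exact estimator of De Riek et al. (2001) is not reproduced here; treating each band as a two-state locus (present with frequency p, absent with frequency 1 − p) is an assumption of this sketch, and the function name and example frequencies are hypothetical.

```python
import math


def nei_standard_distance(freq_x, freq_y):
    """Nei's standard distance from band-presence frequencies of two varieties.

    Assumption: every band is a locus with two states, present (p) and absent (1 - p).
    """
    j_x = sum(p * p + (1 - p) ** 2 for p in freq_x)
    j_y = sum(q * q + (1 - q) ** 2 for q in freq_y)
    j_xy = sum(p * q + (1 - p) * (1 - q) for p, q in zip(freq_x, freq_y))
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# Hypothetical band frequencies for three varieties over four bands.
variety_a = [0.9, 0.8, 0.1, 0.5]
variety_b = [0.8, 0.7, 0.2, 0.5]
variety_c = [0.2, 0.3, 0.9, 0.5]

d_ab = nei_standard_distance(variety_a, variety_b)   # similar frequencies
d_ac = nei_standard_distance(variety_a, variety_c)   # contrasting frequencies
```

Because the distance depends only on variety-level frequencies, it summarizes differences between varieties and carries no direct information on the spread of individual plants within a variety.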

Clustering and bootstrapping were used to compare groupings by the three marker techniques (i.e., AFLP, STMS, and CAPS), employing Nei's pairwise distances (Fig. 1). Dendrograms were constructed with the UPGMA algorithm. Regardless of the marker dataset used, Ariana, Aurelia, and Princesse clustered together at the highest similarity level. With the CAPS dataset, Sylvester is also attributed to this cluster. However, AFLP and STMS bootstrap values restrict the cluster to Ariana and Aurelia. Clustering of MK9907, Sylvester, and H66377 with the AFLP dataset was not supported by the other datasets.
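The clustering step can be illustrated with a small pure-Python UPGMA sketch. It only shows the merging rule (repeatedly joining the two closest clusters and averaging distances weighted by cluster size) and omits bootstrapping; the function name, the tie-breaking by insertion order, and the three-item example matrix are assumptions of this sketch.

```python
def upgma(names, matrix):
    """Return the UPGMA merges as (cluster_a, cluster_b, distance) tuples."""
    sizes = {name: 1 for name in names}
    dist = {}
    for i, a in enumerate(names):
        for j in range(i + 1, len(names)):
            dist[frozenset((a, names[j]))] = matrix[i][j]
    merges = []
    while len(sizes) > 1:
        # Closest pair of current clusters (ties: first inserted pair wins).
        closest = min(dist, key=dist.get)
        a, b = sorted(closest, key=str)
        merge_distance = dist.pop(closest)
        new = (a, b)
        new_size = sizes[a] + sizes[b]
        for c in sizes:
            if c == a or c == b:
                continue
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            # Size-weighted average keeps every original item equally weighted.
            dist[frozenset((new, c))] = (sizes[a] * d_ac + sizes[b] * d_bc) / new_size
        del sizes[a], sizes[b]
        sizes[new] = new_size
        merges.append((a, b, merge_distance))
    return merges


# Hypothetical distances between three items.
names = ["A", "B", "C"]
matrix = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
merges = upgma(names, matrix)
```

In this toy example A and B join first at distance 2, and the resulting cluster joins C at the average distance 6.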


Figure 1. Dendrograms from the standard Nei genetic distances for amplified fragment length polymorphism (AFLP), cleaved amplified polymorphic site (CAPS), and sequenced tagged microsatellite site (STMS) based ordinations (UPGMA-clustering). Bootstrap values are indicated at the nodes.

 
Supervised Classification Techniques Using AMOVA and Assignment Tests
Analysis of Molecular Variance
The AMOVA procedure (Excoffier et al., 1992) provides a general framework for the analysis of population genetic structure based on any distance matrix. We applied a genetic structure design on Euclidean distances between individual plant genotypes, with allocation of the variation to the ploidy level (diploid versus triploid varieties) and, within ploidy level, to varieties. For the three marker datasets used, the major part of the variation could be attributed to variation within varieties, ranging from almost 95% for the AFLP dataset to 84% for the STMS dataset (Table 5). Only a small part of the variation was accounted for by ploidy differences.


Table 5. AMOVA table showing the distribution of the molecular variance according to groups (ploidy level) and populations with indication of derived fixation indices.†

 
None of the molecular methods was very diagnostic in differentiating between varieties; the STMS method had the highest differentiation value, 14%. The F-statistic indices Fst (partitioning of all variation), Fsc (variation within each ploidy level), and Fct (variation due to the ploidy difference alone) reflect the above observations. Fst and Fsc estimates were always significant (1023 permutations) for the three datasets used, while Fct estimates were not significant. Fst was highest for the STMS (0.15) and lowest for the AFLP dataset (0.056). The population pairwise F-statistics matrices revealed a data structure comparable to that obtained with the standard Nei distance (data not shown).
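The three fixation indices are linked through the variance components of the hierarchical AMOVA. As a sketch in standard notation (the symbols for the components among ploidy levels, σ_a², among varieties within ploidy levels, σ_b², and within varieties, σ_c², are assumptions of this note and are not taken from Table 5):

```latex
\begin{aligned}
\sigma_T^2 &= \sigma_a^2 + \sigma_b^2 + \sigma_c^2 \\
F_{ct} &= \frac{\sigma_a^2}{\sigma_T^2}, \qquad
F_{sc} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}, \qquad
F_{st} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_T^2} \\
1 - F_{st} &= (1 - F_{sc})(1 - F_{ct}) = \frac{\sigma_c^2}{\sigma_T^2}
\end{aligned}
```

Under these definitions, 1 − Fst is the within-variety share of the total variance, so the Fst of 0.056 reported for AFLP corresponds to about 94% within-variety variation.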

Assignment based on the Highest Probability of an Individual's Genotype in Any of the Populations
The method described by Paetkau et al. (1995, 1997) starts with the calculation of the probability of the assignment (assignment index) of each individual plant to each variety (data not shown). Assignment tests typically generate asymmetric matrices, showing the number of plants attributed to a certain variety based on this index. For the three marker datasets, plants were in general classified to their original variety (Table 6). The highest assignments were obtained with the STMS dataset. Correlation between assignments based on the different marker datasets was >0.95, significant at P = 0.001. The method also generates a derived distance measure dx,y (Table 6); distances here are to be understood as the ratios of the probability that an individual plant belongs to the original variety compared to the other variety, on a logarithmic scale. Mantel testing showed only poor agreement between the dx,y matrices based on the different markers, indicating that, apart from the apparently different levels of dx,y, the data structure of the matrices derived from the three marker datasets also differed (Tables 6 and 8).


Table 6. Assignment of individual genotypes per variety and derived distance measure dx,y, based on the assignment indices according to Paetkau et al. (1995).†

 

Table 8. Comparison of marker techniques for different summarizing methods by standardized Mantel's statistics g (also expressed as a correlation coefficient r) between similarity matrices. P value was estimated by 1000 random iterations (out of 40,320 possible permutations).†

 
Assignment based on the Pairwise Similarity Data for Individual Plants
Estimates of within-variety genetic variation were also directly assessed from the pairwise similarity data for individual plants using SJacc and SSM (Table 7). Compared to the method by Paetkau et al. (1995), this assignment method produces much more dispersion across different varieties. Marker datasets differed markedly in this respect. Similarities based on the AFLP dataset yielded the most dispersed allocation among varieties, whereas CAPS-based similarities were less dispersed. Sequence tagged microsatellite site-based similarities were clearly more distinguishing, as can be judged from the number of plants that were traced back to the proper variety. The assignment based on the STMS dataset is much more variety specific, indicating that the profiles of varieties are more typical with this dataset.


Table 7. Assignment of individual genotypes per variety and derived similarity by assignment values (Sax,y) based on the pairwise Jaccard similarity data (top 10 most similar assigned plants for each plant).†

 
We propose that the allocation pattern of a single variety among the other varieties can be used as a measure of variety differentiation, particularly as this allocation pattern was found to be relatively independent of the marker dataset used. For instance, Ariana and Aurelia on one side and KWS8123 on the other can be taken as examples of varieties that are very easily distinguishable from each other and from the remaining varieties in the datasets, although they clearly refer to a common gene pool. This can be seen from the (low) degree to which KWS8123 plants are allocated to Ariana and Aurelia. In addition, Princesse refers to this same breeding pool but to a lesser extent. The same can be observed for Sylvester, H66377, and MK9907.

For these assignments too, the correlation between assignments based on the different marker datasets was high (r > 0.88, significant at 0.001).

To better reveal the information content, Table 7 was turned into a symmetric similarity measure (Sax,y) by making it a relative value and simply averaging as introduced for dx,y in Doh (see Materials and Methods). As such, Sax,y can be compared to Table 3 as it also reports on variation within and between varieties in one format. In contrast to the similarity values in Table 3, the Sax,y results have much more internal structure to reveal differences within and between varieties. Note that Sax,y is a relative measure as it is influenced by the number of plants analyzed in each accession (here, equal numbers were taken) and, more important, by the overall composition of the set of varieties taken into the analysis.

Comparison of Approaches Used
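Summarizing output has been generated for the three marker datasets in the form of different resemblance measures, including the average Jaccard similarity, the standard Nei genetic distance, and the similarity by assignment (Sax,y).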

Two major issues need to be addressed: (i) Which classification method gives the most consistent results when the three marker datasets are compared, and (ii) which marker technique yields the most reliable or least constrained summarizing output? To test the concordance between the different summarizing matrices, Mantel's tests were performed (Tables 8 and 9). Table 8 describes the comparison of marker techniques for the different summarizing methods. The Mantel's statistic then indicates whether, given a summarizing technique, the data structure revealed by the measure is consistent between marker techniques. This can most easily be judged from the correlation coefficients r and the corresponding probabilities P. The use of the average Jaccard similarity or standard Nei genetic distance generally yields a weakly correlated summarizing output. In particular, the correlation between AFLP and CAPS is low when the average Jaccard similarity is applied. Significance is low, as can be seen from the P values, which are seldom below the 1% level. In contrast, with Sax,y a good correlation between the summarizing output generated from the different marker techniques was obtained: r was always above 0.80 and, more importantly, P was within the 1% level (or close to 1% for the comparison of CAPS with STMS).


Table 9. Comparison of summarizing methods for a marker technique by standardized Mantel's statistics g (also expressed as a correlation coefficient r) between similarity matrices. P value was estimated by 1000 random iterations (out of 40,320 possible permutations).†

 
In addition to the issue raised above, which is relevant when data from different marker origins are to be compared, it is essential to find out which marker technique is the least constrained by different final data analysis. This can be determined from Table 9, in which matrices are compared for the three molecular methods. In general, the correlations for the different methods are high within a marker technique (Table 9). For the summarizing output generated from the three listed methods for CAPS and STMS datasets, the significances of the correlations as seen from P are within the 1% level. For the AFLP dataset, there seems to be a less correlated output when the standard Nei genetic distance is being applied.


    CONCLUSIONS
In this paper, we evaluated a number of statistical analyses to identify and characterize sugar beet varieties. As stated before by Manel et al. (2005) in an evaluation of different assignment methods to match different biological questions, it is currently, from a theoretical background, often not possible to say with certainty which of the statistical methods perform best and under which conditions. This strengthens the importance of the present comparative analyses.

A first type of analysis technique mainly focuses on the differences between varieties and largely ignores the within-variety variation. A commonly used method is the use of marker frequency data. From Table 4 it can be seen that genetic distances between varieties exist; depending on the specific marker dataset, they are more or less significant when compared to their standard errors. However, the Mantel's analyses (Table 8) indicate rather low correlation coefficients between distance matrices.

When data reduction techniques, such as bidimensional scaling and clustering, were used, an acceptable level of significance was reached for only a limited group of clusters: two clusters in the case of AFLP data and one cluster with the STMS dataset (Fig. 1). Taking into account that this study included only eight varieties as input for clustering, but fingerprinted 30 plants per variety, the results confirmed across the three marker techniques using this approach are rather poor.

AMOVA quantified the within-population variation as ranging from 84% for STMS to 92% for AFLP markers. However, the population pairwise F-statistics matrices generated by the AMOVA routine in Arlequin did not outperform the simple DNei or DEucl calculated directly from the marker frequency data.
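For orientation, the two frequency-based distances can be sketched in Python as follows. The sketch assumes that DNei is the standard genetic distance of Nei, that DEucl is a Euclidean distance between marker-frequency vectors, and that each marker is treated as a two-state locus described by the frequency of one state; the last point is a simplification for dominant markers such as AFLP. The function names are hypothetical.

```python
import numpy as np

def nei_standard_distance(p):
    """Nei's standard distance D = -ln(Jxy / sqrt(Jx * Jy)).

    p: array (n_varieties, n_markers) with the frequency of one state of
    each two-state marker (assumption made for this sketch).
    """
    p = np.asarray(p, dtype=float)
    q = 1.0 - p
    # identity between varieties, summed over markers and both states;
    # averaging over markers would cancel in the ratio below
    j = p @ p.T + q @ q.T
    jx = np.diag(j)
    return -np.log(j / np.sqrt(np.outer(jx, jx)))

def euclidean_distance(p):
    """Euclidean distance between the marker-frequency vectors of varieties."""
    p = np.asarray(p, dtype=float)
    diff = p[:, None, :] - p[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))
```

Both functions use only the frequencies per variety, so the variation among plants within a variety does not enter the calculation.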

A final set of analyses was based on assignment tests, which have the advantage of using the multilocus genotype of each individual. The method of Paetkau et al. (1995) yielded assignments that were too unambiguous (Table 6). The correlation between the marker techniques was high (r > 0.95) but, unfortunately, much of this equivalence disappeared in the derived output: both the level and the range of the derived distance dx,y were inconsistent, as shown by the low Mantel statistics (Tables 6 and 8).
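The principle of a frequency-based assignment test of this kind can be sketched in Python. The sketch is a simplified illustration for codominant, diploid genotypes only: each plant is assigned to the variety in which its multilocus genotype has the highest expected frequency under Hardy-Weinberg proportions and independent loci, with the plant's own alleles left out of its own variety. The function names and the frequency floor of 0.01 for unobserved alleles are assumptions of the sketch and are not taken from this study.

```python
import math
from collections import Counter

def allele_counts(genotypes, labels):
    """Return counts[variety][locus] -> Counter of allele copies."""
    counts = {}
    for geno, lab in zip(genotypes, labels):
        loci = counts.setdefault(lab, [Counter() for _ in geno])
        for locus, (a, b) in enumerate(geno):
            loci[locus][a] += 1
            loci[locus][b] += 1
    return counts

def log_likelihood(geno, loci_counts, own, floor=0.01):
    """Log expected frequency of a diploid multilocus genotype in a variety."""
    total = 0.0
    for (a, b), c in zip(geno, loci_counts):
        n = sum(c.values())
        ca, cb = c[a], c[b]
        if own:
            # leave the plant's own two allele copies out of its variety
            n -= 2
            ca -= 2 if a == b else 1
            cb -= 2 if a == b else 1
        # hypothetical floor keeps unobserved alleles from giving log(0)
        pa = max(ca / n, floor) if n > 0 else floor
        pb = max(cb / n, floor) if n > 0 else floor
        total += math.log(pa * pb if a == b else 2.0 * pa * pb)
    return total

def assign(genotypes, labels, floor=0.01):
    """Assign each plant to the variety with the highest log-likelihood."""
    counts = allele_counts(genotypes, labels)
    assigned = []
    for geno, lab in zip(genotypes, labels):
        scores = {v: log_likelihood(geno, counts[v], v == lab, floor)
                  for v in counts}
        assigned.append(max(scores, key=scores.get))
    return assigned
```

Here `genotypes` is a list of plants, each a list of `(allele, allele)` pairs per locus, and `labels` gives the variety of each plant. Because each plant receives a single best variety, the output carries little information on how close the other varieties are.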

An alternative assignment test based on the pairwise similarity data for individual plants, as introduced by De Riek et al. (2001) for AFLP data, yielded assignments with a more dispersed allocation pattern among varieties than the method of Paetkau et al. (1995), and also showed a higher consistency across the marker techniques used (Tables 7 and 8). A good allocation to the proper variety was obtained, together with a reliable allocation pattern among the other varieties. These two aspects represent the variation within a variety and the distance to other varieties. Although the output is asymmetric, we have shown that it can easily be transformed into an average similarity measure (Sax,y). This index based on assignment tests can be considered a new genetic distance measure with interesting properties:

  1. The assignment tests revealed differences among varieties through the allocation pattern among the other varieties. This pattern is relatively independent of the marker technique used.
  2. The assignments based on the same marker technique but using a different similarity measure were in good agreement.
  3. The scale and range of the distances measured appear to be relatively insensitive to the degree of polymorphism of the marker technique used.
  4. The levels of distinction between varieties obtained were much higher than with the other analysis techniques (i.e., a higher number of plants was assigned correctly).
  5. The measure produced comparable results when calculated using different numbers of best assigned plants (from the top three to the top 30 highest matches for each plant sampled).
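The general idea of a similarity-based assignment and its conversion into a symmetric measure can be sketched in Python. The exact definition of Sax,y is not restated in this section, so the sketch rests on assumptions: Jaccard similarity between binary marker profiles, allocation of each plant to the varieties of its k most similar other plants, and symmetrization by simple averaging of the two directions. All function names are hypothetical.

```python
import numpy as np

def jaccard_similarity(profiles):
    """Pairwise Jaccard similarity between binary marker profiles (rows = plants)."""
    x = np.asarray(profiles, dtype=int)
    shared = x @ x.T
    present = x.sum(axis=1)
    union = present[:, None] + present[None, :] - shared
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(union > 0, shared / union, 0.0)

def allocation_matrix(sim, labels, k=3):
    """Row x, column y: share of the k best matches of plants of variety x
    that belong to variety y (asymmetric in general)."""
    labels = list(labels)
    varieties = sorted(set(labels))
    index = {v: i for i, v in enumerate(varieties)}
    s = np.array(sim, dtype=float)
    np.fill_diagonal(s, -np.inf)            # a plant never matches itself
    alloc = np.zeros((len(varieties), len(varieties)))
    for i, row in enumerate(s):
        best = np.argsort(row)[::-1][:k]    # indices of the k highest matches
        for j in best:
            alloc[index[labels[i]], index[labels[j]]] += 1
    return varieties, alloc / alloc.sum(axis=1, keepdims=True)

def similarity_by_assignment(alloc):
    """Average the two directions of the allocation matrix (assumed form)."""
    return (alloc + alloc.T) / 2.0
```

Varying `k` in such a sketch corresponds to changing the number of best matches considered per plant, which is the threshold referred to in item 5.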

Because a similarity by assignment is by nature related to the composition of the dataset (the varieties it is compared with in the assignment test) and to the assignment thresholds imposed (the number of most related plants each individual is attributed to), it should not be treated as an absolute estimate of genetic distance. However, compared to the other analysis techniques in this study, it accomplishes a superior distinction among genetically diverse varieties in a complex cross-pollinating, polyploid crop such as sugar beet. To our knowledge, this is the first time assignment methods were used for variety identification.


    ACKNOWLEDGMENTS
 
The authors wish to thank the KBIVB-IRBAB for the good cooperation and Laurence Desmet, Veerle Buysens, and Veerle Cools for their efforts in doing DNA preparations and microsatellite analysis. The research was funded by the former Belgian Ministry of Small Enterprises, Traders and Agriculture DG4 and by the Netherlands' Ministry of Agriculture, Nature and Food Safety through the Method Development scheme of CGN.


    NOTES
 
All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

Received for publication May 2, 2006.


This article has been cited by other articles:


J. Wang, M. P. Dobrowolski, N. O.I. Cogan, J. W. Forster, and K. F. Smith
Assignment of Individual Genotypes to Specific Forage Cultivars of Perennial Ryegrass Based on SSR Markers
Crop Sci., January 28, 2009; 49(1): 49 - 58.