a Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, 411 Borlaug Hall, 1991 Upper Buford Circle, St. Paul, MN 55108
b Apex Motion Control, Inc., 15691 92 Ave., Surrey, BC, Canada V49 3C3
* Corresponding author (ehlke001{at}umn.edu)
| ABSTRACT |
|---|
Abbreviations: AFLP, amplified fragment length polymorphism; RMU, relative migration unit
| INTRODUCTION |
|---|
Fluorescence-based AFLP can be performed with an automated DNA sequencer, such as the ABI Prism 377 (Applied Biosystems, Foster City, CA)1, to run gels. Output from the sequencer is processed by ABI Prism Genescan Analysis Software, which detects DNA fragments as peaks in an electropherogram. Because a size standard is loaded in each lane, fragment sizing is accurate, but variability remains. Typical differences in fragment sizing are described by Mitchell et al. (1997), who showed that identical simple sequence repeat fragments were sized within a range of 0.17 base pairs within a gel and 0.46 base pairs across gels. Because of this imprecision in fragment sizing, fragments are often sized as intermediate between whole base pairs. Therefore, some researchers have opted for the term relative migration units (RMUs) rather than base pairs (O'Hanlon et al., 1999). We find this terminology to be more accurate, and will therefore use it throughout the paper.
Because of the inherent variability in fragment size calling, categories defined by a midpoint and some level of tolerance (e.g., 55 ± 0.25 RMU) must be created for each fragment. Fragments in lanes can then be scored automatically for each category as being present (one) or absent (zero). Creating these categories accurately and rapidly is currently one of the most time-consuming steps of fluorescence-based AFLP. This is particularly true in genetic diversity studies, where many markers are scored across numerous individuals.
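As an illustration of category-based scoring, the following minimal Python sketch scores each lane as one or zero for a single category. The function name `score_category`, the lane names, and the peak sizes are hypothetical and are not taken from Peakmatcher or Genotyper; only the 55 ± 0.25 RMU category comes from the example above.

```python
def score_category(peaks, midpoint, tolerance=0.25):
    """Return 1 if any peak (in RMUs) lies within midpoint +/- tolerance, else 0."""
    return int(any(abs(p - midpoint) <= tolerance for p in peaks))


# Hypothetical peak lists (in RMUs), one list per lane.
lanes = {
    "genotype_A": [54.91, 72.30, 101.12],
    "genotype_B": [55.62, 72.41],
}

# Score every lane for the category 55 +/- 0.25 RMU.
scores = {name: score_category(peaks, 55.0) for name, peaks in lanes.items()}
print(scores)
```

Under these assumed data, genotype_A has a peak inside the category and genotype_B does not, so the two lanes would be scored differently for this marker.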
Two main approaches to creating categories have been used: graphical and text-based. The graphical approach uses software such as ABI Prism Genotyper or Genographer (Benham et al., 1999). These software packages allow the user to visualize the data collected by the sequencer and manually create categories for fragments. The advantage of this approach is that the researcher is assured that the categories reflect the researcher's best judgment. The disadvantages are that the process is time-consuming, that it is difficult to perform with large numbers of genotypes because of limited computer screen sizes, and that categories need to be reconstructed when more individuals are added to the data set.
Programs such as Genotyper can create a text output of all the peaks detected in a given lane. Utilizing the text-based data is an attractive alternative to the graphical approach because it should allow for uniform size calling without the need to make visual judgments from gel images. However, setting up accurate categories has proved to be problematic. McGregor et al. (2000) created categories every one RMU by rounding all fragment sizes (peak location in RMUs) to the nearest whole number. The advantages of this approach are simplicity and speed. The disadvantage is that some fragments may actually be centered between two RMUs. When rounding to the nearest whole number, these fragments will then be divided into two categories rather than be included in a single category. Therefore, this approach is detrimental to the repeatability of the data, is seldom used, and is advised against by Smith (1995).
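The splitting problem with whole-number rounding can be seen in a short Python sketch. The four size calls below are hypothetical values for one fragment centered near 55.5 RMU; they are illustrative only and are not data from any of the cited studies.

```python
from collections import Counter

# Hypothetical size calls (in RMUs) for the same fragment in four lanes.
sizes = {"lane1": 55.46, "lane2": 55.52, "lane3": 55.48, "lane4": 55.55}

# Rounding each size call to the nearest whole RMU.
rounded = {lane: round(size) for lane, size in sizes.items()}

# Number of lanes assigned to each whole-number category.
counts = Counter(rounded.values())

# Total spread of the size calls across the four lanes.
spread = max(sizes.values()) - min(sizes.values())

print(rounded)
print(dict(counts))
print(f"spread = {spread:.2f} RMU")
```

Although the assumed size calls differ by less than 0.1 RMU, rounding places two lanes in category 55 and two in category 56, so a single fragment would be scored as two different markers.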
Another text-based technique for creating categories is the histogram method. With this technique, a list of all the detected peaks (in RMUs) is placed in a spreadsheet such as Microsoft Excel. The histogram function is used to determine how many peaks are in each bin (bin sizes are typically set to 0.1 RMUs). By looking at the histogram, the user can determine the optimal location of categories with a reasonable degree of accuracy. These categories can be created manually, and a binary table can be generated in a software package such as AFLPapp (available from http://hordeum.oscs.montana.edu/software/AFLPapp/; verified February 14, 2002). Although this technique can readily be used with large numbers of genotypes, it still requires time for manual category creation and recreation of categories when genotypes are added, and repeatability can be low because of variability in fragment size calling.
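The histogram step can also be sketched outside a spreadsheet. The Python example below counts peaks in 0.1-RMU bins and prints a simple text histogram; the peak list is hypothetical, and assigning each peak to the nearest tenth of an RMU is one assumed binning convention, not necessarily the one used by the Excel histogram function.

```python
from collections import Counter

# Hypothetical list of all detected peaks (in RMUs), pooled across lanes.
peaks = [55.02, 55.04, 54.98, 55.11, 55.51, 55.49, 55.53, 72.30]

# Assign each peak to the nearest 0.1-RMU bin. Bins are keyed by integer
# tenths of an RMU (e.g., 550 means 55.0) to avoid floating-point keys.
histogram = Counter(int(round(p * 10)) for p in peaks)

# Print a text histogram, one row per occupied bin.
for tenths in sorted(histogram):
    print(f"{tenths / 10:6.1f} RMU  {'#' * histogram[tenths]}")
```

With these assumed peaks, clusters appear near 55.0 and 55.5 RMU, and a user would then place category midpoints at those locations by eye.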
In summary, previous approaches to creating categories for semiautomated fluorescence-based AFLP have been time-consuming, difficult to use with numerous genotypes, or have lacked an acceptable degree of accuracy. To deal with these problems, we have developed a software tool (Peakmatcher) that uses text-based output from Genotyper software to rapidly create optimal categories and subsequently generate a binary table for the presence or absence of every fragment for each genotype.
| Description of the Program |
|---|
The unique approach used in Peakmatcher is to identify the best categories primarily on the basis of repeatability. Therefore, the program requires two or more replications of each genotype. Optimally, the replications would be generated by repeating the AFLP technique and running the products on separate gels. Running the same product in multiple lanes on the DNA sequencer would be an inexpensive means to obtain replications, but would only account for variation in fragment sizing, not variation in the AFLP reactions.
To run Peakmatcher, a list of all detected peaks in each lane must be generated from Genescan files by means of a software package such as Genotyper. Each list is entered into a separate row or column in an Excel spreadsheet and identified by replication number and genotype. When Peakmatcher is run, the user selects the data to analyze, enters values for several user-defined settings, and initiates the analysis.
The internal operation of Peakmatcher consists of two major processes. The first is the generation of marker categories according to user preferences. The user specifies one or more ranges to use and the interval to use between the midpoints of the ranges. Categories of every range requested are then generated with midpoints at the specified interval. Typically, thousands of categories are generated in this step. Within each category, each genotype is scored as having a peak or peaks present in all replications (present), some replications (not repeatable), or no replications (absent).
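The category-generation process described above can be illustrated with a minimal Python sketch. This is not Peakmatcher's actual code: the function names, the interpretation of a "range" as the full width of a category around its midpoint, and all data values are assumptions made for illustration.

```python
def make_categories(start, stop, interval, widths):
    """Yield (midpoint, width) pairs with midpoints spaced `interval` RMUs
    apart from `start` to `stop`, one category per requested width."""
    n_steps = int(round((stop - start) / interval))
    for i in range(n_steps + 1):
        midpoint = round(start + i * interval, 4)
        for width in widths:
            yield midpoint, width


def score_genotype(replicates, midpoint, width):
    """Score one genotype in one category from its replicate peak lists."""
    half = width / 2
    hits = [any(abs(p - midpoint) <= half for p in rep) for rep in replicates]
    if all(hits):
        return "present"
    if any(hits):
        return "not repeatable"
    return "absent"


# Hypothetical settings: midpoints every 0.1 RMU from 50 to 60 RMU,
# with assumed category widths of 0.4 and 0.6 RMU.
categories = list(make_categories(50.0, 60.0, 0.1, [0.4, 0.6]))

# Hypothetical data: two replicate peak lists (in RMUs) per genotype.
data = {
    "A": [[55.02, 72.30], [54.95, 72.40]],
    "B": [[55.10], [61.00]],
    "C": [[80.00], [80.10]],
}

# Score every genotype for the single category 55.0 RMU, width 0.4 RMU.
scores = {g: score_genotype(reps, 55.0, 0.4) for g, reps in data.items()}
print(len(categories), scores)
```

Even this small assumed window of 10 RMUs yields a few hundred candidate categories, which is consistent with thousands being generated over a full size range.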
The second major operation Peakmatcher performs is the elimination of undesirable categories until only categories containing useful, highly repeatable markers remain. Undesirable categories are removed through a linear process of elimination with seven steps:
|
|
| Evaluation of the Program |
|---|
Results of the analyses revealed that when time and accuracy are considered, Peakmatcher was clearly superior to other techniques (Table 3). With Illinois bundleflower, Peakmatcher results were identical to those produced by a graphical method, but required only 11% of the time necessary for graphical analysis. With the quackgrass data set, one fewer marker was obtained by Peakmatcher than by graphical analysis, but the time required for Peakmatcher analysis was only 9% of the time required for graphical analysis.
Although Peakmatcher can expedite analyses with small numbers of individuals, as we have demonstrated, its greatest advantage is realized when large numbers of individuals are analyzed. We have used Peakmatcher to successfully analyze data from two separate genetic diversity studies using AFLPs. Both studies included more than 150 individuals, and the analyses were performed in less than one hour. Subsets of the output generated by Peakmatcher from both experiments were crosschecked with Genographer, and the Peakmatcher output was found to be consistently reliable. The speed and accuracy with which Peakmatcher can analyze large data sets should expedite the process of semiautomated fluorescence-based AFLP.
| Availability and Requirements |
|---|
Peakmatcher requires a personal computer with Microsoft Windows 98 and Excel 97 or newer to operate. The program has processed data sets containing >150 individuals on computers with 128 megabytes of RAM. Larger data sets are likely to require additional memory.
| NOTES |
|---|
1 Names are necessary to report factually on available data; however, the Univ. of Minnesota neither guarantees nor warrants the standard of the product, and the use of the name by the Univ. of Minnesota implies no approval of the product to the exclusion of others that may be suitable.
Received for publication April 11, 2001.
| REFERENCES |
|---|