Dep. of Crop Sciences, Univ. of Illinois at Urbana-Champaign, 332 NSRC, 1101 W. Peabody Drive, Urbana, IL 61801
* Corresponding author (gca{at}uiuc.edu)
| ABSTRACT |
|---|
Abbreviations: C-value, the DNA amount in the unreplicated haploid nucleus of an organism (C stands for constant); GPWG, Grass Phylogeny Working Group; ITS, internal transcribed spacer; MYA, million years ago; PACCAD, panicoids, arundinoids, chloridoids, centothecoids, aristidoids, and danthonioids; rRNA, ribosomal RNA; SP, squared-change parsimony; SRP, signal recognition particle; WP, Wagner parsimony.
| INTRODUCTION |
|---|
Grass species differ widely in morphology and physiology. Similarly, they show striking differences in the size of their genomes, with DNA content ranging from 0.5 to 40 pg DNA per 2C nucleus (Bennett et al., 1982, 2000; Bennett and Smith, 1976, 1991; Bennett and Leitch, 1995, 2003). Besides polyploidization and duplication, genome size differences are embodied in noncoding and repetitive DNA (SanMiguel et al., 1996, 1998). These differences often result from mutational mechanisms of nucleic acid addition and loss, such as transposition (transposable element activity), spontaneous insertions and deletions, and chromosomal rearrangements (Petrov, 2001). For example, genome expansion in Arabidopsis appears to be counteracted by genome reduction through illegitimate recombination (Devos et al., 2002). Genome size differences are important in the context of conservation of gene content and order (synteny and colinearity) and the possibility of extending genomic information directly from one grass species to another using comparative genomic approaches (Devos and Gale, 2000). This has motivated evolutionary studies that trace changes in genome size throughout the history of diversification of the grass family (Bennetzen and Kellogg, 1997; Kellogg, 1998). While these phylogenetic studies have shown that increases in genome size have occurred over evolutionary time, it is not clear if decreases are equally likely (Bennetzen and Kellogg, 1997). Furthermore, establishing the direction of genome size evolution is difficult and often intimately linked to phylogenetic inferences and resolution of deep branches in grass phylogenies.
In this study, well-established methods of character state reconstruction were used to trace evolutionary changes in genome size along the branches of a phylogenetic tree that describes the evolution of major grass lineages. The phylogeny that was used includes grass species with diploid genomes of known DNA content and integrates results from a recent and comprehensive phylogenetic study based on macromorphology, anatomy, biochemistry, and the sequence of chloroplast and nuclear genes (Grass Phylogeny Working Group, 2001) with inferences from RNA structure and large-scale chromosomal rearrangements (Caetano-Anollés, 2005). Different models of character change were used to reconstruct evolution of genome size and evaluate patterns of genome size increase and reduction in these grasses.
| MATERIALS AND METHODS |
|---|
Phylogenetic Assumptions
A phylogeny depicting the history of diversification of major lineages of the grass family was assembled from the diploid species selected, and genome size was traced along its branches. Phylogenetic relationships of diploid grasses summarized by Kellogg (1998) were superimposed on segments of the skeletal phylogeny proposed by the Grass Phylogeny Working Group (2001) rerooted in the Ehrhartoideae (Caetano-Anollés, 2005). The Grass Phylogeny Working Group research consortium described the evolutionary relationships of representative species within major grass subfamilies using macromorphology, anatomy, biochemistry, and molecular features such as restriction endonuclease maps of the chloroplast genome and the nucleotide sequence of chloroplast (ndhF, rbcL, and rpoC2) and nuclear (phyB and waxy) genes and intergenic ribosomal RNA (rRNA) spacers, together with maximum parsimony methods of tree reconstruction (Grass Phylogeny Working Group, 2001). The Grass Phylogeny Working Group phylogeny combined eight character sets, some of which support the existence of a Pooideae and a PACCAD clade strongly (chloroplast restriction-morphological data), moderately (rbcL), or weakly [chloroplast restriction data and internal transcribed spacer (ITS) rRNA]. Other character sets were inconclusive or supported other groupings that were slightly more parsimonious, but none included any morphological synapomorphies. Deep branching patterns that reroot the tree in the Ehrhartoideae and support a sister clade relationship between the Pooideae and PACCAD clade were derived directly from geometrical and statistical features describing the structure of signal recognition particle (SRP) RNA, the small subunit of rRNA, enod40 mRNA and ITS rRNA, and large-scale chromosomal rearrangements (insertions, translocations, and instances of chromosomal orthology) using maximum parsimony methods (Caetano-Anollés, 2005).
For RNA, molecules were characterized by attributes such as nucleotide length of molecular components, thermodynamic properties such as minimum Gibbs free energy increments, or statistical parameters that describe the stability and uniqueness of folded conformations. Attributes were then treated as linearly ordered multistate characters that were polarized by a model of character state transformation in which structures with increased molecular order and minimum frustration were defined as being ancestral (Caetano-Anollés, 2002a, 2002b). This approach is supported by considerations in statistical mechanics, produces intrinsically rooted trees that "embed structure and function directly into phylogenetic analysis" (Pollock, 2003), and was used successfully to reconstruct a phylogeny of the living world (Caetano-Anollés, 2002a) and study ribosomal evolution (Caetano-Anollés, 2002b). In the analysis of chromosomal rearrangements, phylogenetic reconstruction supports the proposal that genetic linkage blocks that are freestanding in rice were ancestral and that rearrangements that result in translocations, insertions, and redistribution of rice linkage blocks were derived events (Gale and Devos, 1998).
Evolutionary Tracing of Genome Size in the Grass Family
Genome size was traced along the individual branches of the phylogenetic tree. Ancestral character states were reconstructed using algorithms for squared-change (Maddison, 1991) and Wagner parsimony (Swofford and Maddison, 1987) in MACCLADE v. 3.08 (Maddison and Maddison, 1999). Squared-change parsimony (SP) minimizes the sum of the squared changes on the branches of the tree and can be considered a Bayesian probability estimate under a Brownian motion model of evolution. Wagner (linear) parsimony (WP) minimizes the sum of the absolute value of changes on the branches of the trees. Because this produces a range of equally parsimonious values, only minimum values were chosen.
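The two optimality criteria described above can be illustrated with a minimal Python sketch. It is not the MACCLADE implementation: the three-taxon tree, the node names, and the tip values are hypothetical, branch lengths are ignored, and the helper functions are written only for this illustration. The Wagner routine uses a Farris-style downpass of intervals and then picks the minimum value at the root, which only loosely mirrors the choice of minimum values mentioned in the text; the squared-change routine relaxes each internal node toward the mean of its neighbors, which is the stationarity condition for the unweighted sum of squared changes.

```python
# Toy rooted tree (hypothetical): internal node -> list of children.
TREE = {"root": ["n1", "C"], "n1": ["A", "B"]}
# Made-up 2C DNA contents (pg) for the terminal taxa.
TIPS = {"A": 1.0, "B": 3.0, "C": 5.0}


def wagner_parsimony(tree, tips, root="root"):
    """One reconstruction minimizing the sum of absolute changes."""
    interval = {tip: (value, value) for tip, value in tips.items()}

    def downpass(node):
        # The optimal interval of a node lies between the two middle
        # endpoints of its children's intervals.
        if node in interval:
            return interval[node]
        ends = sorted(e for child in tree[node] for e in downpass(child))
        k = len(tree[node])
        interval[node] = (ends[k - 1], ends[k])
        return interval[node]

    downpass(root)
    state = dict(tips)
    state[root] = interval[root][0]  # assumption: take the minimum at the root

    def uppass(node):
        # Each descendant takes the value in its interval closest to its parent.
        for child in tree.get(node, []):
            if child not in tips:
                low, high = interval[child]
                state[child] = min(max(state[node], low), high)
            uppass(child)

    uppass(root)
    return state


def squared_change_parsimony(tree, tips, sweeps=500):
    """Reconstruction minimizing the unweighted sum of squared changes."""
    neighbors = {}
    for parent, children in tree.items():
        for child in children:
            neighbors.setdefault(parent, []).append(child)
            neighbors.setdefault(child, []).append(parent)
    state = dict(tips)
    start = sum(tips.values()) / len(tips)
    for node in tree:
        state[node] = start
    # Gauss-Seidel relaxation: at the optimum every internal node equals
    # the mean of the nodes it is connected to.
    for _ in range(sweeps):
        for node in tree:
            linked = neighbors[node]
            state[node] = sum(state[n] for n in linked) / len(linked)
    return state


def tree_length(tree, state, power=1):
    """Sum of |change|**power over all branches of the tree."""
    return sum(abs(state[p] - state[c]) ** power
               for p, children in tree.items() for c in children)


if __name__ == "__main__":
    wp = wagner_parsimony(TREE, TIPS)
    sp = squared_change_parsimony(TREE, TIPS)
    print("WP states:", wp, "length:", tree_length(TREE, wp, 1))
    print("SP states:", sp, "length:", tree_length(TREE, sp, 2))
```

On a real data set the tree would carry many more taxa and, for the squared-change criterion, branch lengths could be used to weight changes; both refinements are omitted here for brevity.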
| RESULTS |
|---|
Genome size increased in the phylogenetic tree of the grasses under all models. Genome size values reconstructed at internal nodes showed clear patterns of increase (cf. nodes defining the ancestor of grasses, Pooideae, Bromeae, and Triticeae) and decrease (e.g., in Ehrhartoideae and the PACCAD clade). There were notable genome size changes in the Pooideae [ancestors of Lygeum spartum Loefl. ex L. (two- to sevenfold increase), Agropyron (19-fold increase and 10- to 12-fold decrease), and Brachypodium (four- to fivefold decrease)] and Chloridoideae [ancestors of Eragrostis tef (Zucc.) Trott. (fourfold decrease) and Spartina anglica C.E. Hubbard (four- to 22-fold increase)]. Overall patterns of genome size increase were also evident when comparing the number of increases and decreases occurring in individual clades [as increase/decrease ratios (r); Table 1]. In all three models, ratios increased in the order Ehrhartoideae, PACCAD, and Pooideae clades, showing different tendencies in genome size diversification in these major plant groups.
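As a rough illustration of how an increase/decrease ratio (r) could be tallied for a clade, the sketch below counts the branches on which the reconstructed value rises or falls from ancestor to descendant. The tree, the node values, and the function name are hypothetical, and the counting rule (one event per branch, with unchanged branches ignored) is an assumption rather than the exact procedure behind Table 1.

```python
def increase_decrease_ratio(tree, state, tol=1e-9):
    """Count branches with increases and decreases and return (n_up, n_down, r)."""
    increases = decreases = 0
    for parent, children in tree.items():
        for child in children:
            delta = state[child] - state[parent]
            if delta > tol:
                increases += 1
            elif delta < -tol:
                decreases += 1
    # With no decreases the ratio is undefined; report infinity in that case.
    ratio = increases / decreases if decreases else float("inf")
    return increases, decreases, ratio


if __name__ == "__main__":
    # Hypothetical clade: internal node -> children, with made-up values (pg per 2C).
    clade = {"root": ["n1", "C"], "n1": ["A", "B"]}
    values = {"root": 2.0, "n1": 3.0, "A": 1.0, "B": 6.0, "C": 5.0}
    print(increase_decrease_ratio(clade, values))
```

A ratio above 1 would indicate that increases outnumber decreases within the clade under the chosen reconstruction.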
| DISCUSSION |
|---|
A correct phylogeny of the grasses defines a direction of change and is critical in our efforts to establish tendencies in genome size evolution (Bennetzen and Kellogg, 1997). In fact, alternative rootings of the major grass subfamilies affected inferences of ancestral genome size when DNA contents (C-values) were traced on a phylogeny of the grasses using SP (see below). Given the deep-branching relationships inferred using the novel comparative approach (Caetano-Anollés, 2005) and the detailed genetic relationships of the Grass Phylogeny Working Group skeletal phylogeny (Grass Phylogeny Working Group, 2001), I assembled a phylogenetic tree of representative grass species and traced the evolution of genome size along its branches using different models of character change (Fig. 1–3, Table 1). Two kinds of algorithms were used to find ancestral states that were most parsimonious. One minimizes the sum of the absolute values of the changes using a linear Wagner parsimony criterion (Swofford and Maddison, 1987), and the other minimizes the sum of the squared changes using a Brownian motion model of evolution (Maddison, 1991). Using these parsimony criteria, genome size increased and decreased along the lineages of the grass family at varying levels. Alternatively, a unidirectional model that prohibits genome size decreases was used and resulted in higher levels of genome size increase along the tree. The existence of increase-only mechanisms is an unrealistic evolutionary scenario. Genome size reduction explains the excess of sequences flanking BARE-1 retrotransposons in barley as remnants of reductive recombination events (Vicient et al., 1999; Shirasu et al., 2000), and DNA sequencing of a large 211-kb DNA segment in diploid wheat revealed a complex pattern of genome rearrangement, including deletion of large DNA fragments containing retroelements (Wicker et al., 2001).
Furthermore, differences in rates of DNA loss appear to be important determinants of genome size evolution in insects (Petrov et al., 2000), and could play a similar role in the grasses. Similarly, decrease-only explanations for genome size evolution are highly unlikely. There is ample evidence that grass genomes have increased in size by chromosomal duplication and the effects of repetitive DNA (SanMiguel et al., 1996, 1998; Bennetzen, 2000; Gaut et al., 2000). For example, the maize genome increased considerably in size by a retrotransposon invasion that began 5.2 MYA (SanMiguel et al., 1998). Similarly, recent and hidden polyploidization events appear to be a widespread phenomenon in the grasses, and both result in genome size increases (Levy and Feldman, 2002).
While recent polyploidy can be easily recognized, many species currently regarded as bona fide diploids could actually represent paleopolyploids, ancient polyploids with disomic inheritance and progenitors that cannot be identified using cytology or DNA markers. In fact, polyploidy could have occurred in the lineage of at least 70% of angiosperms (Masterson, 1994) and appears to be a revolutionary and ongoing process in the grasses (Levy and Feldman, 2002). The controversial proposal that genome evolution is mainly driven by whole-genome duplication (Ohno, 1970) has been recently used to explain chromosomal and synteny patterns in plants (e.g., angiosperms; Bowers et al., 2003) and fungi (e.g., hemiascomycete fungi; Dujon et al., 2004). Even vertebrates are believed to have experienced two rounds of paleopolyploidization (Wolfe, 2001). However, evidence in favor of paleopolyploidization events has been criticized because the observations could be more parsimoniously explained by local duplication and genomic rearrangements (e.g., Hughes and Friedman, 2003). For example, gene interleaving patterns of synteny in yeast (Saccharomyces cerevisiae) were better explained by segmental duplication and recombination when analyzing the gene complements of entire chromosomes with advanced rearrangement algorithms (N. Martin et al., 2004, unpublished data). Phylogenetic analysis and comparative genomics will ultimately help identify the role of ancient polyploidy in evolution of genome size, as C-values are traced along branches delimiting phylogenetic hypotheses.
Character tracing offers the opportunity to study how changes in genome size distribute along different grass lineages and draw inferences on the genome size of hypothetical ancestors by assigning character states to internal nodes of the trees. Most changes in grass genome size did not exceed twofold increases or decreases, and few exceeded threefold levels of change. This occurred when using the two parsimony models of character evolution. There were clear patterns of increase and decrease in the Ehrhartoideae, Pooideae, and PACCAD clades, and notable genome size changes in the Pooideae and Chloridoideae. There were also notable increases and decreases in genome size occurring within an individual genus (e.g., Agropyron). In fact, this was expected. For example, a clear reduction of genome size has been recently proposed in one of two distinct lineages of Sorghum (Price et al., 2005). Parsimony analysis suggested an ancestor of the grass family with a genome size between 3.0 and 5.2 pg DNA per 2C nucleus (2.9 × 10⁹ and 5.1 × 10⁹ base pairs). This represents a genome that is six to 10 times bigger than the smallest (Oropetium thomaeum Trin.) and five to seven times smaller than the largest (Lygeum spartum) diploid genome described (Bennett and Leitch, 2003). Interestingly, the analysis of a phylogeny of 37 diploid species using the SP method resulted in an ancestral genome size (3.5 pg DNA per 2C nucleus) that was much lower than the estimate obtained in this study (Kellogg, 1998).
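The base-pair figures quoted for the ancestral genome are consistent with the commonly used conversion of roughly 0.978 × 10⁹ base pairs per picogram of DNA. The short sketch below reproduces that arithmetic; the conversion factor is an assumption introduced here (the text does not state which factor was used), and the function name is hypothetical.

```python
# Assumed conversion factor: about 0.978e9 base pairs per picogram of DNA.
BP_PER_PG = 0.978e9


def pg_to_bp(picograms):
    """Convert a DNA amount in picograms to an approximate number of base pairs."""
    return picograms * BP_PER_PG


if __name__ == "__main__":
    # Bounds of the ancestral grass genome size given in the text (pg per 2C nucleus).
    for pg in (3.0, 5.2):
        print(f"{pg} pg is about {pg_to_bp(pg):.2e} bp")
```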
Only a few studies have traced genome size evolution along the branches of phylogenetic trees using character reconstruction methods. This is because the exercise requires both robust phylogenetic hypotheses and comprehensive sampling of DNA content among clades, conditions that are only recently beginning to be met. Recent studies suggest that the genome size of ancestors of angiosperms (Soltis et al., 2003) and land plants (Embryophyta) (Leitch et al., 2005) was small (1.4 pg DNA per 1C nucleus). Most of the major clades within angiosperms (e.g., monocots, magnoliids, eudicots) also appear to have small ancestral genomes, showing many possible instances of genome size increase and decrease in clades that occupy derived positions of the trees (Soltis et al., 2003; Leitch et al., 2005). Within this framework, monocotyledonous plants appear to exhibit several independent instances of genome size increase in major lineages, but this depends fundamentally on how well phylogenies resolve in each lineage (Leitch et al., 2005). In this regard, the present study confirms an instance of overall genome size increase in a derived lineage of the Commelinids. Clearly, a comprehensive character reconstruction effort will be needed to obtain more accurate estimates of ancestral genome size for internal nodes along the lineages of terrestrial plants and a better picture of genome size evolution.
While the WP method may be too conservative, the SP method may overestimate the ancestral size of a genome. The validity of hypotheses of character change was therefore tested with an example of genome evolution in the Panicoideae. Pennisetum and maize diverged about 29 MYA, followed 9 million years later by the divergence of the two diploid progenitors of maize (Gaut et al., 2000). Genome size of the progenitors of Zea and Tripsacum (that diverged 4.7 MYA) doubled as the result of a segmental allotetraploid event, which occurred sometime between the divergence of sorghum (6.5 MYA) and the rediploidization of the ancestors of maize (11 MYA). This allotetraploid event was followed by genome rearrangement and then by a retrotransposon invasion that began before the split of Zea and Tripsacum (Gaut et al., 2000). Comparison of the retrotransposon-invaded Adh1 region of maize and the retrotransposon-free Adh1 region in barley suggested that the genome size of the ancestor of maize doubled in size (SanMiguel et al., 1998). The genome sizes of the hypothetical ancestors that maize shares with Pennisetum, Sorghum, and Tripsacum were 5.3, 6, and 6.9 pg per 2C nucleus, respectively, when inferred using the SP method (Fig. 1) and 3, 3, and 5.3 pg per 2C nucleus when inferred using the WP method (Fig. 2). Only the WP method accounted for the expected retrotransposon-induced doubling of genome size proposed during the late history of diversification of maize. Neither method revealed a major increase resulting from segmental allotetraploidy. The SP method showed only modest increases (15–30%) in genome size of ancestors along the maize lineage (starting 29 MYA). In fact, these increases were lower than those observed within species of Zea (30–72%). Only the unidirectional character tracing method accounted for increases due to allotetraploidy and retrotransposon proliferation, but this is an unrealistic model that discards the mechanisms of genome contraction that have been proposed to be linked to these same two phenomena.
The results of this study extend early proposals that suggest genome size has both increased and decreased along phylogenetic lineages of the grass family (Bennetzen and Kellogg, 1997). Bidirectional evolution of plant genome size appears to be a widespread phenomenon. It was recently reported in the cotton tribe Gossypieae (Wendel et al., 2002), and its dynamic nature has been described in angiosperms (Soltis et al., 2003) and land plants (Leitch et al., 2005). The present study also shows that different models of character evolution imparted different frequencies and levels of change along the branches of the trees. While the SP method favored decreases over increases, the linear WP method minimized the sum of the absolute values of change along branches of the grass tree and produced a more even distribution of genome size. Note that the SP method maximizes the Bayesian posterior probability (probability of a hypothesis given the data) of reconstruction of character states at ancestral nodes when changes are inversely weighted by the length of the branches (Maddison, 1991). However, the method may overestimate changes in the trees. Future efforts to mitigate difficulties in the prediction of genome size of hypothetical ancestors may involve weighting schemes that take into account variation in rates of genome size evolution along branches of the tree.
| NOTES |
|---|
Received for publication October 13, 2004.