a Biotechnology Dep., South African Sugar Association Exp. Stn., Private Bag X02, Mount Edgecombe, 4300, South Africa
b Institute for Plant Biotechnology, University of Stellenbosch, Private Bag X1, Matieland, 7602, South Africa
c Institute for Plant Biotechnology, Univ. of Stellenbosch, Private Bag X1, Matieland, 7602, South Africa
fcb{at}land.sun.ac.za
| ABSTRACT |
|---|
Abbreviations: bp, base pair; EST, expressed sequence tag; NCBI, National Center for Biotechnology Information; SuSy, sucrose synthase; pfu, plaque forming units; PAM, Point Accepted Mutation
| INTRODUCTION |
|---|
The last decade has seen a rapid proliferation in knowledge about plant and animal genomes through the application of large-scale partial sequencing of anonymous cDNA clones from cDNA libraries and their subsequent identification through homology searches of public databases. This approach, commonly referred to as Expressed Sequence Tag (EST) analysis, has been extensively applied in large-scale cDNA sequencing projects for a variety of both plant and animal species such as humans (Adams et al., 1991, 1992), nematodes (McCombie et al., 1992; Waterston et al., 1992), Arabidopsis [Arabidopsis thaliana (L.) Heynh.] (Newman et al., 1994), and rice (Oryza sativa L.) (Sasaki et al., 1994). These groups have shown that partial cDNA sequences, or ESTs, can be used successfully to identify putative clones for a wide range of gene products. ESTs have been reported both in the literature and public databases for 47 690 rice cDNAs (Uchimaya et al., 1992; Sasaki et al., 1994, dbEST release February 2000), 193 090 Arabidopsis cDNAs (Höfte et al., 1993; Newman et al., 1994, dbEST release February 2000), and 55 466 maize (Zea mays L.) cDNAs (Keith et al., 1993, dbEST release February 2000). However, far fewer plant ESTs than animal ESTs are available in the public databases. As a result, many plant genes are identified on the basis of sequence similarity to animal rather than plant sequences. There is a need, therefore, to identify and characterize new plant genes in order to increase the availability of plant genes in the international public databases.
Sugarcane biotechnology research world-wide is focused primarily on two main areas, genetic manipulation and identification of markers. One of the problems associated with genetic manipulation of sugarcane is the lack of homologous gene sequences, especially important for antisense work. Similarly, the lack of known sugarcane genes also has implications for molecular marker programs. The most recently published sugarcane maps have been constructed by means of anonymous restriction fragment length polymorphism (RFLP) and random amplified polymorphic DNA (RAPD) probes as well as heterologous probes from species such as maize, oat (Avena sativa L.), and rice (da Silva et al., 1995; Grivet et al., 1996). The identification of sugarcane genes could thus have significant consequences for sugarcane mapping and genetic manipulation and is therefore of great importance.
As a first step to address this issue, we have prepared cDNA libraries from different tissue types in the sugarcane plant. Here we report on the preliminary analysis of 250 anonymous cDNA clones from a library composed of mRNA isolated from the leaf roll (meristematic region) of the commercial sugarcane cultivar NCo376. This work will make a significant contribution towards sugarcane biotechnology.
| MATERIALS AND METHODS |
|---|
Construction of a Leaf Roll cDNA Library
cDNA Synthesis
First-strand cDNA synthesis was performed according to a modification of the method described in the Promega Protocols and Applications Guide (1990). Approximately 1 µg of poly(A+) RNA was used in a first-strand synthesis reaction catalyzed by RNase H- M-MuLV (Moloney murine leukemia virus) reverse transcriptase (Stratagene, La Jolla, CA) with oligo d(T)18 as the primer. Final reaction conditions for first-strand synthesis were as follows: 1 µg mRNA; 0.5 µg/µg mRNA of oligo d(T)18; 50 mM Tris-HCl, pH 8.3; 75 mM KCl; 3 mM MgCl2; 10 mM DTT; 1 mM each of dATP, dCTP, dGTP, dTTP; 1.6 u/µL ribonuclease inhibitor; 50 u/µg mRNA of RNase H- M-MuLV reverse transcriptase. The reaction was incubated at 37°C for 1 h. Second-strand synthesis was performed directly after first-strand synthesis, according to the method described in the Promega Protocols and Applications Guide (1990), with the components added to the same tube. Final reaction conditions for second-strand synthesis were 50 mM Tris-HCl (pH 7.6); 100 mM KCl; 5 mM MgCl2; 5 mM DTT; 0.1 mM NAD; 10 mM (NH4)2SO4; 8 u/mL RNase H; 230 u/mL DNA polymerase I; 5 u/mL E. coli DNA ligase; 50 µg/mL BSA; 0.2 mM each of dATP, dCTP, dGTP, dTTP from first-strand reaction. The reaction was incubated at 14°C for 2 h. After heat inactivation (70°C, 10 min), second-strand synthesis was completed by the addition of T4 DNA polymerase (2 u/µg mRNA) and incubation for 10 min at 37°C. The ds cDNA product was phenol:chloroform extracted and purified through a QIAquick Spin column (Qiagen GmbH, Hilden, Germany) according to the manufacturer's instructions. cDNA was ethanol precipitated prior to ligation to amplification adaptors.
Ligation to Amplification Adaptors
cDNA was blunt-end ligated to an annealed amplification adaptor set (Jepson et al., 1991). This adaptor set consisted of the following two oligonucleotides:
[Sequences of the two adaptor oligonucleotides were given as images in the original and are not reproduced here.]
Ligation was allowed to proceed overnight at 14°C. After ligation, cDNA was size fractionated through a Quick-Spin, Linkers 6 column (Roche Molecular Biochemicals, Indianapolis, IN).
PCR Amplification of cDNA
Ligated, size-fractionated cDNA was PCR amplified with Oligonucleotide 1 as the primer. The final reaction conditions were as follows: 1× Taq DNA Polymerase buffer [50 mM KCl, 10 mM Tris-HCl (pH 9.0), 0.1% (v/v) Triton X-100]; 600 ng Oligonucleotide 1; 1.25 mM each deoxynucleotide triphosphate (dNTP); 3.5 mM MgCl2; 1 unit Taq DNA polymerase; 1 µL ds cDNA template. PCR amplification was performed in a Hybaid OmniGene Thermal Cycler (OmniGene Bioproducts, Inc., Cambridge, MA) under the following conditions: 1 cycle at 73°C for 1 min, followed by 35 cycles of 94°C, 0.8 min; 68°C, 1.1 min; 73°C, 3.0 min. An aliquot of each amplified cDNA sample was analyzed on a 1.5% (w/v) agarose gel to confirm that amplification was successful. The remainder was used for cloning.
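As a rough planning aid, the thermal profile above can be written down as data and its programmed hold time summed. This is only a sketch: the variable and function names are illustrative, and ramping time between temperatures (which depends on the instrument) is ignored.

```python
# Thermal profile as stated in the text: (temperature in degrees C, hold time in min).
INITIAL_STEP = (73, 1.0)
CYCLE_STEPS = [(94, 0.8), (68, 1.1), (73, 3.0)]
N_CYCLES = 35


def total_hold_time(initial, cycle, n_cycles):
    """Sum the programmed hold times in minutes (ramp times are not included)."""
    return initial[1] + n_cycles * sum(minutes for _, minutes in cycle)


total_min = total_hold_time(INITIAL_STEP, CYCLE_STEPS, N_CYCLES)
print(f"Programmed hold time: {total_min:.1f} min")
```

Under these assumptions the program holds for a little under 3 h in total, most of it in the 3.0 min step at 73°C.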
Library Construction
All individual PCR amplified cDNA samples were pooled and ethanol precipitated. cDNA was digested with 30 units EcoRI for 2.5 h and approximately 150 to 200 ng removed for cloning. cDNA was cloned into the EcoRI site of the Lambda ZAP II cloning vector and packaged according to the manufacturer's instructions (Stratagene, La Jolla, CA).
Template Preparation
Aliquots of the constructed leaf roll library were plated out onto solid NZY medium and single plaques were randomly picked and stored in SM buffer [100 mM NaCl, 8 mM MgSO4·7H2O, 20 mM Tris-HCl pH 7.5, 0.01% (w/v) gelatin] at 4°C. The insert sizes of individual recombinant phages were examined by specific PCR amplification with the M13 reverse and T7 primers followed by 1.5% (w/v) agarose gel electrophoresis. Templates for the ESTs from the leaf roll library were prepared in two ways. In the first, phagemids [pBluescript SK(-)] plus inserts were excised from individual phages using the ExAssist helper phage system according to the manufacturer's instructions (Stratagene). Individual phagemid clones were plated out onto solid Luria-Bertani (LB) medium containing 50 µg/mL ampicillin. For phagemid DNA isolation, a single colony of each clone was inoculated into a 10-mL overnight culture of LB broth containing 50 µg/mL ampicillin. Phagemid DNA was isolated from a 5-mL aliquot of the overnight culture using a Rapid Plasmid Isolation Protocol (Holmes and Quigley, 1981) and purified through QIAquick spin columns (Qiagen). In the second, templates for DNA sequencing were prepared by specific PCR amplification of cDNA inserts directly from individual phage suspensions in SM buffer with the M13 reverse and T7 primers. Amplified inserts were purified with QIAquick spin columns (Qiagen) prior to sequencing.
Sequencing
Both phagemid and amplified insert cDNA were sequenced by dye terminator cycle sequencing by means of either the Taq DyeDeoxy Terminator Cycle Sequencing kit (PE Applied Biosystems, Foster City, CA), followed by purification through Centri-Sep Spin columns (Princeton Separations, Adelphia, NJ), or the AmpliTaq DNA polymerase, FS ready reaction kit (PE Applied Biosystems). In both cases, all procedures were performed according to the manufacturer's instructions. The M13 Reverse (5') primer was used to generate single-pass partial sequences for all isolated cDNAs. Cycle sequencing was performed in a Hybaid OmniGene Thermal Cycler and sequence analysis was performed with an ABI Prism 310 Genetic Analyzer (PE Applied Biosystems).
Sequence Data Analysis
Sequences were edited manually to remove vector and ambiguous sequences. The EST sequences were compared with the nonredundant protein databases by using the BLASTX (Altschul et al., 1990) e-mail server provided by NCBI (blast@ncbi.nlm.nih.gov). Sequences showing a Point Accepted Mutation (PAM) 120 similarity score of over 80 were considered homologous to the matched proteins (Altschul et al., 1990), while those with scores below 80 were regarded as showing only sequence similarity. Each EST was assigned the identity of the candidate protein with the highest score.
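The identification rule described above can be sketched in a few lines. This is an illustration rather than the procedure actually used: the `Hit` type and `identify_est` function are hypothetical names, the hit lists would come from parsed BLASTX output, and treating a score of exactly 80 as "sequence similarity" is an assumption, since the text only speaks of scores over or below 80.

```python
from typing import NamedTuple

SCORE_THRESHOLD = 80  # PAM 120 similarity score cutoff stated in the text


class Hit(NamedTuple):
    protein: str   # description of the matched database protein
    score: float   # BLASTX PAM 120 similarity score for this match


def identify_est(hits):
    """Return (protein, label) for the highest-scoring hit, or None if there are no hits."""
    if not hits:
        return None
    best = max(hits, key=lambda hit: hit.score)
    # Assumption: a score of exactly 80 is grouped with "sequence similarity".
    label = "homologous" if best.score > SCORE_THRESHOLD else "sequence similarity"
    return best.protein, label
```

For example, an EST whose best candidate scores 120 would be labeled "homologous" to that protein, while one whose best candidate scores 60 would be recorded as showing "sequence similarity" only.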
| RESULTS |
|---|
Generation of Expressed Sequence Tags
For generation of the ESTs, only clones with an insert larger than 400 bp were selected for sequencing. Altogether 250 clones were subjected to single-run partial sequencing, 60 of these using plasmid DNA as sequencing template, and the remaining 190 using DNA obtained by specific PCR amplification of insert DNA from recombinant phages using the T7 and M13 reverse primers. The amount of template DNA used per sequencing reaction differed depending on the source. For plasmid-derived DNA, 1 µg of template was used and for PCR-amplified DNA, 100 to 200 ng was required. For all sequencing reactions, only the M13 reverse primer (5') was used. As the cDNA library was not a directional library, the orientation of the cDNA inserts was random, so it was not known from which end (5' or 3') each clone had been sequenced. To identify individual clones, each of the edited sequences was translated in all six reading frames and compared to the nonredundant protein sequence databases in GenBank. Deduced amino acid sequence homology between a sugarcane EST and a known sequence was deemed significant if the BLASTX PAM 120 similarity score was greater than 80 (Altschul et al., 1990). All sugarcane ESTs have been deposited in the GenBank database for ESTs, dbEST.
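Because the insert orientation is unknown, both strands have to be considered, which is why all six reading frames are translated. BLASTX performs this step internally; the sketch below only illustrates the idea, assuming the standard genetic code and using illustrative function names. Codons containing ambiguous bases are rendered as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return translations for forward frames 1-3, then reverse-strand frames 1-3."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]
```

Each of the six translations would then be compared against the protein database, and the frame giving the best-scoring match indicates both the likely reading frame and the orientation of the insert.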
Sequencing Template
A small investigation was conducted to determine whether variation occurred in the amino acid sequence homology results when different forms of template DNA were used for sequence analysis. Conventionally, high quality plasmid DNA is the preferred form of template for sequencing reactions. However, the in vivo excision of phagemids from recombinant cDNA clones housed in a Lambda ZAP II vector and the subsequent isolation of phagemid DNA is a time-consuming process which can negatively affect large-scale sequencing efforts. It has been recognized that while direct sequencing of recombinant clones without isolation of plasmid DNA is a favorable alternative, results are often inconsistent. This is because the amount and quality of template DNA generated during PCR amplification of inserts may vary, which in turn can lead to unreliable results. In this study, a comparison was performed between sequencing results obtained using template DNA derived either from recombinant plasmids or PCR-amplified cDNA inserts from recombinant phages. Four different clones were selected arbitrarily. All sequencing reactions and sequence analysis were performed at the same time to minimize experimental error. It is evident that the length of the analyzed sequences is similar, regardless of template source (Table 1). After editing of sequences to remove the vector component, a final analyzed sequence length of approximately 400 bp was obtained for both plasmid and PCR-amplified insert DNA.
Functional Identification of Sugarcane ESTs
All identified ESTs were categorized into general biochemical and metabolic function (Fig. 1). The leaf roll cDNA clones exhibited homology to a broad diversity of genes, including enzymes and proteins associated with ubiquitous metabolic pathways, structural proteins, and components of transcriptional and translational apparatus. The largest number of clones (35%) was found to encode many proteins as yet uncharacterized. There are several high-throughput gene sequencing programs currently in progress and many expressed sequences deposited in the GenBank databases by these groups do not yet have an identity. This results in many putative identities to unknown or hypothetical proteins. Of the remaining 65% of clones that were identified, 12.4% were enzymes. Sucrolytic enzymes were the most common, with nine clones representing six different enzymes being identified. These included key regulatory enzymes such as SuSy (AA080580, AA080610, AA080634, AA269294) and triose phosphate isomerase (AA577653). Several other metabolic pathways were represented including the citric acid cycle, fatty acid metabolism, anaerobic metabolism, and amino acid biosynthesis. A further 10.8% of ESTs were involved in protein modification and 9.7% in protein synthesis. These included eight different ribosomal proteins, represented by 10 individual clones, and a variety of protein kinases.
| DISCUSSION |
|---|
During cDNA library construction, it is assumed that all cDNAs present are equally likely to be cloned. The relative frequency of cDNAs in sugarcane leaf roll tissue would therefore reflect the steady-state levels of the mRNA in the leaf roll. Thus the analysis of cDNA abundance may identify not only fundamental housekeeping genes, but also tissue-specific genes. Because of the small sample size of 250 clones in this study, random sequencing resulted primarily in the identification of genes belonging to the superabundant and abundant classes. To identify rare genes by this approach, it will be necessary either to sequence all the clones in the library or to prepare a normalized library. However, the high cost in both resources and labor required for large-scale sequencing of total cDNA libraries makes it an impractical option for many small laboratories.
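The effect of sample size on which abundance classes are detected can be made concrete with a simple sampling model. The sketch below assumes that clones are drawn independently and that a transcript making up a fraction f of the library is picked with probability f per clone, so the chance of seeing it at least once among n clones is 1 - (1 - f)^n. The transcript frequencies used are hypothetical round numbers chosen for illustration, not values measured in this study.

```python
import math


def prob_detect(freq, n_clones):
    """Probability that a transcript at fraction `freq` appears at least once in n_clones."""
    return 1.0 - (1.0 - freq) ** n_clones


def clones_needed(freq, prob):
    """Smallest number of random clones giving at least `prob` chance of detection."""
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - freq))


# Hypothetical transcript frequencies (fraction of all clones in the library).
for freq in (1e-2, 1e-3, 1e-5):
    print(f"f = {freq:g}: P(detect in 250 clones) = {prob_detect(freq, 250):.3f}")
```

Under this model a transcript present at 1 in 100 clones is very likely to be seen among 250 clones, whereas one present at 1 in 100 000 is almost certain to be missed, which is consistent with the bias toward abundant classes noted above.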
A variety of studies have shown that the composition of clones identified in cDNA libraries reflects the regulation of gene expression related to differentiation, growth condition, or environmental stress. In a recent review of the Rice Genome Project (Yamamoto and Sasaki, 1997), results were presented from EST identification of clones from a variety of tissues subjected to different growth conditions. This research has indicated, for example, that many ribosomal proteins and histone genes were found in growth-phase callus while genes encoding globulin and seed storage proteins such as glutelin and prolamine were identified in ripening panicles. Similarly, in developing castor endosperm a significant proportion of identified clones showed homology to storage proteins or components of the protein biosynthetic apparatus (van de Loo et al., 1995). In this study, the distribution of identified genes between the various metabolic pathways indicated that in sugarcane leaf roll genes involved in protein synthesis, protein modification and glycolysis were the most abundant (Fig. 1). In addition, there was also a significant proportion of genes coding for structural and cell wall proteins. These results probably reflect the high metabolic rate of the leaf roll. In addition, it was not surprising that only one clone was identified as being stress induced (disease resistance protein, RPM1). Because the leaf roll is protected by several leaf sheaths, it is not normally subject to insect or pathogen attack and will therefore not be adversely affected by environmental stresses except under extreme conditions. Some unexpected genes were also detected. Two clones were identified with homologies to a germin-like protein and a stage III sporulation protein, both involved in processes not considered to occur in sugarcane. 
A similar phenomenon has been observed in maize where proteins involved in nodulation and other processes specifically present in legumes were identified (Shen et al., 1994). These authors suggested that genes with specific functions in some species may have been "borrowed" through evolution to form new genes with different functions, or which simply share some common functional domain.
During the course of the sequencing of the 250 cDNA clones, it was found that several types of clones were identified more than once. It is acknowledged that, compared with many other EST projects, a sample size of 250 is very small. It is also assumed that during the construction of the cDNA library, the PCR amplification of the cDNA was proportional and thus the library is representative of the mRNA pool. On this basis, it may be inferred that the occurrence of multiple copies of specific genes may be indicative of their relative frequency and reflect possible trends in level of expression in the leaf roll. Ten of the ESTs showed similarity to eight different ribosomal proteins (Table 4). Seven of these were large subunit proteins and one was a small subunit protein; the set also included two chloroplast ribosomal proteins. This result was not unexpected because of the vigorous growth state of the leaf roll. Ribosomal proteins are fundamental proteins for living systems and are thought to play a specific regulatory role during development. Many ribosomal genes have been identified in growth-phase callus of rice (Yamamoto and Sasaki, 1997), so it seems likely that in sugarcane, ribosomal proteins would be specifically involved in differentiation and growth in the meristematic leaf roll region. Of particular interest in sugarcane is the identification of clones homologous to the SuSy gene. Expression of SuSy in the leaf roll was found to be quite high (1.6% of total genes identified) compared with 0.6% expression in rice endosperm (Liu et al., 1995). Although the reaction catalyzed by SuSy is readily reversible, there is evidence that it is primarily involved in the breakdown of sucrose (Kruger, 1990). It has been shown that in actively growing tissues where there is high demand for hexose sugars as respiratory substrates, SuSy activity is high (Kruger, 1990).
The apparent high expression of SuSy in sugarcane leaf roll could therefore be expected to be primarily related to the breakdown of sucrose in order to meet the demand for respiratory metabolites. The homology search results indicate that all the SuSy ESTs might be from the same expressed gene. However, more research is needed to establish whether this is the case. It is interesting to note that the sugarcane cDNA exhibited the highest homologies to the SuSy gene sequences from dicotyledonous species, despite the presence of SuSy gene sequences from other monocotyledonous plants in the database. The reasons for this observation are not immediately apparent. Other clones that were identified more than once could also be related to the active metabolic state of the leaf roll (Table 4). For example, expression of pectin methylesterase is related to cell wall biosynthesis during cell division. Likewise, 3-oxoacyl-(acyl-carrier protein) reductase expression is essential for cell membrane biosynthesis. Further work aimed at analyzing expression profiles of leaf roll cDNA clones using macroarrays is currently in progress. These results will supplement the trends observed from the random sequencing.
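The abundance figures quoted in this discussion follow from simple clone counts. As a minimal check, assuming the denominator is all 250 sequenced clones (the text says "total genes identified", so this is an interpretation), the four SuSy accessions listed in the Results give the 1.6% quoted above; the function name is illustrative.

```python
TOTAL_CLONES = 250  # clones sequenced in this study

# SuSy accessions listed in the Results.
susy_accessions = ["AA080580", "AA080610", "AA080634", "AA269294"]
ribosomal_est_count = 10  # ESTs matching ribosomal proteins, as stated in the text


def percent_of_library(n_clones, total=TOTAL_CLONES):
    """Express a clone count as a percentage of all sequenced clones."""
    return 100.0 * n_clones / total


print(f"SuSy: {percent_of_library(len(susy_accessions)):.1f}%")
print(f"Ribosomal proteins: {percent_of_library(ribosomal_est_count):.1f}%")
```

With so few clones per gene, a single additional or missing clone changes these percentages noticeably, so they are best read as trends rather than precise expression levels.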
No similar work on the construction of an EST database has yet been reported for sugarcane. This research has indicated that genes may be easily identified in sugarcane and has provided information about the metabolic state of the leaf roll, independent of the complexity of the sugarcane genome. It has also provided a resource of gene sequence information for sugarcane that may be applied to sugarcane biotechnology research. Further work is underway to develop an EST database for mature internodal tissue, the region in the plant where sucrose accumulation occurs.
| ACKNOWLEDGMENTS |
|---|
| NOTES |
|---|
Received for publication November 1, 1999.
| REFERENCES |
|---|