Expressed Sequence Tags With cDNA Termini: Previously Overlooked Resources for Gene Annotation and Transcriptome Exploration in Chlamydomonas reinhardtii
- 1 May 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 179 (1), 83-93
- https://doi.org/10.1534/genetics.107.085605
Abstract
Many Chlamydomonas reinhardtii expressed sequence tags (ESTs) in GenBank dbEST and community EST assemblies were either over- or undertrimmed with respect to their cDNA termini, which are defined as the diagnostic sequence elements that delineate the 3'/5' ends of mRNA transcripts. Overtrimming loses directional, positional, and structural information about transcript ends, whereas undertrimming leaves spurious sequences in ESTs that adversely affect downstream EST-based applications. We examined 309,278 raw EST sequencing trace files of C. reinhardtii and found that only 57% had cDNA termini that matched the expected structures specified in their cDNA library constructions while satisfying our minimum length requirement for their final clean sequences. Using GMAP, 156,963 individual ESTs were mapped to the genome successfully, with their in silico-verified cDNA termini anchored to the genome. Our data analysis suggested strong macro- and microheterogeneity in the 3'/5' end positions of individual transcripts derived from the same genes in C. reinhardtii. This work, which annotates the differential ends of individual transcripts in the draft genome, presents the research community with a new stream of data that will facilitate accurate determination of gene structures, genome annotation, and exploration of the transcriptome and mRNA metabolism in C. reinhardtii.
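The idea of checking an EST for an expected cDNA terminus and then applying a minimum length requirement to the clean sequence can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes, purely for illustration, that the 3' terminus is a terminal poly(A) run, and the function name `trim_poly_a` and both thresholds are hypothetical values that do not come from the paper.

```python
import re

# Hypothetical thresholds chosen for illustration; not taken from the paper.
MIN_TAIL_LENGTH = 10    # minimum run of A's accepted as a poly(A) terminus
MIN_CLEAN_LENGTH = 100  # minimum length of the sequence left after trimming


def trim_poly_a(seq, min_tail=MIN_TAIL_LENGTH, min_clean=MIN_CLEAN_LENGTH):
    """Trim a terminal poly(A) run from an EST read.

    Returns (clean_sequence, tail_length) when a terminal run of at least
    `min_tail` A's is found and the remaining sequence has at least
    `min_clean` bases. Returns None otherwise.
    """
    # Look for a run of A's that extends to the very end of the read.
    match = re.search(r"A{%d,}$" % min_tail, seq.upper())
    if match is None:
        return None  # no recognizable terminus; the read is not accepted
    clean = seq[:match.start()]
    if len(clean) < min_clean:
        return None  # clean sequence too short to keep
    # The tail length is kept so the terminus position can be recorded.
    return clean, len(match.group())


if __name__ == "__main__":
    read = "ACGT" * 30 + "A" * 15
    result = trim_poly_a(read)
    if result is not None:
        clean, tail = result
        print(len(clean), tail)
```

Keeping the tail length alongside the clean sequence, rather than discarding it, reflects the point made in the abstract that removing the terminus without recording it loses positional information about the transcript end.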