Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing
Open Access
- 3 March 2009
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 106 (9), 3264-3269
- https://doi.org/10.1073/pnas.0812841106
Abstract
Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of ESTs or cDNA libraries, are expensive and labor-intensive. Here, we present a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalog. Our approach automatically and fully defines 86% of the genes expressed under the given conditions, and discovers 160 previously undescribed transcription units of 250 bp or longer. It correctly demarcates the 5′ and 3′ UTR boundaries of 86 and 77% of expressed genes, respectively. The method further identifies 83% of known splice junctions in expressed genes, and discovers 25 previously uncharacterized introns, including 2 cases of condition-dependent intron retention. Our framework is applicable to poorly understood organisms, and can lead to greater understanding of the transcribed elements in an explored genome.
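The abstract does not describe the algorithms themselves, so the following is only an illustrative sketch of the general idea of calling transcribed regions from mapped short reads, not the authors' method. It assumes reads have already been aligned to the genome and summarized as a per-base coverage list; the function name `call_transcribed_units` and the `min_cov` threshold are hypothetical, and only the 250 bp minimum length is taken from the abstract.

```python
from typing import List, Tuple


def call_transcribed_units(
    coverage: List[int],
    min_cov: int = 1,      # hypothetical depth threshold, not from the paper
    min_len: int = 250,    # minimum unit length in bp, as quoted in the abstract
) -> List[Tuple[int, int]]:
    """Return half-open [start, end) intervals of contiguous bases whose
    read depth is at least min_cov and whose length is at least min_len."""
    units: List[Tuple[int, int]] = []
    start = None  # start of the currently open covered run, if any
    for pos, depth in enumerate(coverage):
        if depth >= min_cov:
            if start is None:
                start = pos  # open a new covered run
        elif start is not None:
            # The run ended at pos; keep it only if it is long enough.
            if pos - start >= min_len:
                units.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(coverage) - start >= min_len:
        units.append((start, len(coverage)))
    return units


# Toy coverage track: a 300 bp covered run and a 100 bp covered run.
toy_coverage = [0] * 10 + [5] * 300 + [0] * 5 + [3] * 100 + [0] * 20
long_units = call_transcribed_units(toy_coverage)
all_units = call_transcribed_units(toy_coverage, min_len=50)
```

With the default 250 bp cutoff only the longer run is reported; lowering `min_len` also reports the shorter one. A real pipeline would additionally need to handle multi-mapping reads, coverage dips inside a single transcript, and reads spanning splice junctions, none of which this sketch attempts.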