Leveraging skewed transcript abundance by RNA-Seq to increase the genomic depth of the tree of life
- 4 January 2010
- journal article
- research article
- Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences
- Vol. 107 (4) , 1476-1481
- https://doi.org/10.1073/pnas.0910449107
Abstract
Assembling the tree of life is a major goal of biology, but progress has been hindered by the difficulty and expense of obtaining the orthologous DNA required for accurate and fully resolved phylogenies. Next-generation DNA sequencing technologies promise to accelerate progress, but sequencing the genomes of hundreds of thousands of eukaryotic species remains impractical. Eukaryotic transcriptomes, which are smaller than genomes and biased toward highly expressed genes that tend to be conserved, could potentially provide a rich set of phylogenetic characters. We sampled the transcriptomes of 10 mosquito species by assembling 36-bp sequence reads into phylogenomic data matrices containing hundreds of thousands of orthologous nucleotides from hundreds of genes. Analysis of these data matrices yielded robust phylogenetic inferences, even with data matrices constructed from surprisingly few sequence reads. This approach is more efficient, data-rich, and economical than traditional PCR-based and EST-based methods and provides a scalable strategy for generating phylogenomic data matrices to infer the branches and twigs of the tree of life.Keywords
This publication has 55 references indexed in Scilit:
- Evidence for an ancient adaptive episode of convergent molecular evolutionProceedings of the National Academy of Sciences, 2009
- RNA-Seq: a revolutionary tool for transcriptomicsNature Reviews Genetics, 2009
- Resolving Arthropod Phylogeny: Exploring Phylogenetic Signal within 41 kb of Protein-Coding Nuclear Gene SequenceSystematic Biology, 2008
- Accurate whole human genome sequencing using reversible terminator chemistryNature, 2008
- Mapping and quantifying mammalian transcriptomes by RNA-SeqNature Methods, 2008
- RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed modelsBioinformatics, 2006
- Initial sequencing and analysis of the human genomeNature, 2001
- Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic InferenceMolecular Biology and Evolution, 1999
- Gene Trees in Species TreesSystematic Biology, 1997