Prospects for Building the Tree of Life from Large Sequence Databases
- 12 November 2004
- journal article
- other
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 306 (5699), 1172-1174
- https://doi.org/10.1126/science.1102036
Abstract
We assess the phylogenetic potential of ∼300,000 protein sequences sampled from Swiss-Prot and GenBank. Although only a small subset of these data was potentially phylogenetically informative, this subset retained a substantial fraction of the original taxonomic diversity. Sampling biases in the databases necessitate building phylogenetic data sets that have large numbers of missing entries. However, an analysis of two “supermatrices” suggests that even data sets with as much as 92% missing data can provide insights into broad sections of the tree of life.
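To make the "missing data" figure concrete, the following is a minimal sketch of how the fraction of missing entries in a supermatrix might be computed. It assumes a simplified representation in which each taxon maps to a list of booleans, one per gene, marking whether a sequence is available. The function name `missing_fraction` and the toy taxa are hypothetical and are not taken from the article.

```python
def missing_fraction(presence):
    """Return the fraction of empty cells in a taxon-by-gene presence matrix.

    `presence` maps a taxon name to a list of booleans, one per gene
    (True = a sequence is available). This layout is an assumption made
    for illustration, not the article's data format.
    """
    total = sum(len(row) for row in presence.values())
    if total == 0:
        raise ValueError("presence matrix has no cells")
    filled = sum(1 for row in presence.values() for cell in row if cell)
    return 1 - filled / total


# Hypothetical toy supermatrix: 3 taxa by 4 genes, 5 of 12 cells filled.
toy = {
    "taxon_A": [True, True, False, False],
    "taxon_B": [True, False, False, False],
    "taxon_C": [False, True, True, False],
}

print(f"{missing_fraction(toy):.0%} of cells are missing")
```

Under this definition, a supermatrix with 92% missing data is one in which only about 8 of every 100 taxon-by-gene cells contain a sequence.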
This publication has 18 references indexed in Scilit:
- Phylogenomics of Eukaryotes: Impact of Missing Data on Large Alignments. Molecular Biology and Evolution, 2004
- Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature, 2003
- The challenge of constructing large phylogenetic trees. Trends in Plant Science, 2003
- Obtaining Maximal Concatenated Phylogenetic Data Sets from Large Sequence Databases. Molecular Biology and Evolution, 2003
- Extracting Species Trees From Complex Gene Trees: Reconciled Trees And Vertebrate Phylogeny. Molecular Phylogenetics and Evolution, 2000
- A few logs suffice to build (almost) all trees (I). Random Structures & Algorithms, 1999
- Phylogenetic supertrees: Assembling the trees of life. Published by Elsevier, 1998
- Angiosperm Phylogeny Inferred from 18S Ribosomal DNA Sequences. Annals of the Missouri Botanical Garden, 1997
- Inferring complex phylogenies. Nature, 1996
- The guinea-pig is not a rodent. Nature, 1996