Automatic genome-wide reconstruction of phylogenetic gene trees
Open Access
- 1 July 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (13) , i549-i558
- https://doi.org/10.1093/bioinformatics/btm193
Abstract
Gene duplication and divergence is a major evolutionary force. Despite the growing number of fully sequenced genomes, methods for investigating these events on a genome-wide scale are still in their infancy. Here, we present SYNERGY, a novel and scalable algorithm that uses sequence similarity and a given species phylogeny to reconstruct the underlying evolutionary history of all genes in a large group of species. In doing so, SYNERGY resolves homology relations and accurately distinguishes orthologs from paralogs. We applied our approach to a set of nine fully sequenced fungal genomes spanning 150 million years, generating a genome-wide catalog of orthologous groups and corresponding gene trees. Our results are highly accurate when compared to a manually curated gold standard, and are robust to the quality of input according to a novel jackknife confidence scoring. The reconstructed gene trees provide a comprehensive view of gene evolution on a genomic scale. Our approach can be applied to any set of sequenced eukaryotic species with a known phylogeny, and opens the way to systematic studies of the evolution of individual genes, molecular systems and whole genomes.
Contact: aregev@broad.mit.edu
Supplementary information: Supplementary data are available at Bioinformatics online.