RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs
Open Access
- 16 May 2002
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 3 (1) , 14
- https://doi.org/10.1186/1471-2105-3-14
Abstract
When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in high-throughput genome annotation ("phylogenomics") is widely recognized, but existing approaches are either manual or not explicitly based on phylogenetic trees. Here we present RIO (Resampled Inference of Orthologs), a procedure for automated phylogenomics using explicit phylogenetic inference. RIO analyses are performed over bootstrap resampled phylogenetic trees to estimate the reliability of orthology assignments. We also introduce supplementary concepts that are helpful for functional inference. RIO has been implemented as a Perl pipeline connecting several C and Java programs. It is available at http://www.genetics.wustl.edu/eddy/forester/ . A web server is at http://www.rio.wustl.edu/ . RIO was tested on the Arabidopsis thaliana and Caenorhabditis elegans proteomes. The RIO procedure is particularly useful for the automated detection of first representatives of novel protein subfamilies. We also describe how some orthologies can be misleading for functional inference.
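The idea of estimating the reliability of orthology assignments over bootstrap resampled trees can be illustrated with a minimal Python sketch. It is not RIO's implementation (which the abstract describes as a Perl pipeline connecting C and Java programs). It assumes that each bootstrap gene tree has already been built and that every internal node has already been labeled as a speciation ("S") or duplication ("D") event; the tree encoding, the function names, and the sequence names are all hypothetical.

```python
from collections import Counter

# Assumed encoding (hypothetical): a tree is either a leaf name (str) or a
# tuple (event, left, right), where event is "S" (speciation) or
# "D" (duplication). Event labels are taken as given, not inferred here.

def leaves(tree):
    """Return the set of leaf names under a tree node."""
    if isinstance(tree, str):
        return {tree}
    _, left, right = tree
    return leaves(left) | leaves(right)

def orthologs_of(tree, query):
    """Return leaves whose last common ancestor with `query` is a speciation node."""
    if isinstance(tree, str):
        return set()
    event, left, right = tree
    if query in leaves(left):
        near, far = left, right
    elif query in leaves(right):
        near, far = right, left
    else:
        return set()
    # Orthologs found deeper in the subtree that contains the query.
    found = orthologs_of(near, query)
    # At a speciation node, everything on the other side is orthologous.
    if event == "S":
        found |= leaves(far)
    return found

def orthology_support(trees, query):
    """Fraction of resampled trees in which each sequence is an ortholog of `query`."""
    counts = Counter()
    for tree in trees:
        counts.update(orthologs_of(tree, query))
    return {name: counts[name] / len(trees) for name in counts}

# Four hypothetical bootstrap trees over the same three sequences.
tree_a = ("S", ("D", "query", "seq_A"), "seq_B")
tree_b = ("D", ("S", "query", "seq_A"), "seq_B")
bootstrap_trees = [tree_a, tree_b, tree_a, tree_a]

support = orthology_support(bootstrap_trees, "query")
print(support)
```

In this toy input, `seq_B` is orthologous to the query in three of the four resampled trees and `seq_A` in one, so the support values separate a well-supported orthology from a weakly supported one.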