Annotation Transfer Between Genomes: Protein–Protein Interologs and Protein–DNA Regulogs
Open Access
- 1 June 2004
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (6) , 1107-1118
- https://doi.org/10.1101/gr.1774904
Abstract
Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their experimental generation remains difficult. Consequently, interolog mapping (the transfer of interaction annotation from one organism to another using comparative genomics) is of significant value. Here we quantitatively assess the degree to which interologs can be reliably transferred between species as a function of the sequence similarity of the corresponding interacting proteins. Using interaction information from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Helicobacter pylori, we find that protein–protein interactions can be transferred when a pair of proteins has a joint sequence identity >80% or a joint E-value <10^-70. (These “joint” quantities are the geometric means of the identities or E-values for the two pairs of interacting proteins.) We generalize our interolog analysis to protein–DNA binding, finding such interactions are conserved at specific thresholds between 30% and 60% sequence identity depending on the protein family. Furthermore, we introduce the concept of a “regulog”, a conserved regulatory relationship between proteins across different species. We map interologs and regulogs from yeast to a number of genomes with limited experimental annotation (e.g., Arabidopsis thaliana) and make these available through an online database at http://interolog.gersteinlab.org. Specifically, we are able to transfer ∼90,000 potential protein–protein interactions to the worm. We test a number of these in two-hybrid experiments and are able to verify 45 overlaps, which we show to be statistically significant.
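The "joint" quantities described in the abstract can be illustrated with a minimal Python sketch. It assumes that each interacting protein in the source organism has already been aligned to its counterpart in the target organism, giving one percent identity and one E-value per protein. The function names (`joint_identity`, `joint_evalue`, `is_transferable`) and the handling of a zero E-value are hypothetical choices made for illustration and are not taken from the paper; only the geometric-mean definition and the two thresholds (>80% joint identity, joint E-value below 10^-70) come from the abstract.

```python
import math


def joint_identity(identity_a: float, identity_b: float) -> float:
    """Geometric mean of the two percent identities (one per interacting protein)."""
    return math.sqrt(identity_a * identity_b)


def joint_evalue(evalue_a: float, evalue_b: float) -> float:
    """Geometric mean of the two E-values, computed in log space.

    Assumption: an E-value reported as 0 is treated as yielding a joint E-value of 0.
    """
    if evalue_a <= 0 or evalue_b <= 0:
        return 0.0
    mean_log10 = (math.log10(evalue_a) + math.log10(evalue_b)) / 2
    return 10 ** mean_log10


def is_transferable(identity_a: float, identity_b: float,
                    evalue_a: float, evalue_b: float,
                    min_identity: float = 80.0,
                    max_evalue: float = 1e-70) -> bool:
    """Apply the abstract's criterion: joint identity > 80% OR joint E-value < 10^-70."""
    return (joint_identity(identity_a, identity_b) > min_identity
            or joint_evalue(evalue_a, evalue_b) < max_evalue)


if __name__ == "__main__":
    # Hypothetical alignment results for the two proteins of one candidate interolog.
    print(joint_identity(90.0, 72.9))
    print(joint_evalue(1e-80, 1e-90))
    print(is_transferable(90.0, 72.9, 1e-10, 1e-10))
```

Working in log space keeps the E-value mean numerically stable when the inputs are very small; the default thresholds are exposed as parameters so that other cutoffs can be tried.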