An Integrated Probabilistic Model for Functional Prediction of Proteins
- 1 March 2004
- journal article
- conference paper
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 11 (2-3) , 463-475
- https://doi.org/10.1089/1066527041410346
Abstract
We develop an integrated probabilistic model to combine protein physical interactions, genetic interactions, highly correlated gene expression networks, protein complex data, and domain structures of individual proteins to predict protein functions. The model is an extension of our previous model for protein function prediction based on Markovian random field theory. The model is flexible in that other protein pairwise relationship information and features of individual proteins can be easily incorporated. Two features distinguish the integrated approach from other available methods for protein function prediction. The first is that the integrated approach uses all available sources of information, with different weights for different sources of data. It is a global approach that takes the whole network into consideration. The second is that it assigns a posterior probability that a protein has the function of interest. The posterior probability indicates how confident we are about assigning the function to the protein. We apply our integrated approach to predict functions of yeast proteins based upon MIPS protein function classifications and upon the interaction networks based on MIPS physical and genetic interactions, gene expression profiles, tandem affinity purification (TAP) protein complex data, and protein domain information. We study the recall and precision of the integrated approach using different sources of information by the leave-one-out approach. Compared with using MIPS physical interactions only, the integrated approach combining all of the information increases the recall from 57% to 87% when the precision is set at 57%, an increase of 30 percentage points.