Prediction of Protein Function Using Protein–Protein Interaction Data
- 1 December 2003
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 10 (6) , 947-960
- https://doi.org/10.1089/106652703322756168
Abstract
Assigning functions to novel proteins is one of the most important problems in the postgenomic era. Several approaches have been applied to this problem, including the analysis of gene expression patterns, phylogenetic profiles, protein fusions, and protein-protein interactions. In this paper, we develop a novel approach that employs the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of protein's interaction partners. For each function of interest and protein, we predict the probability that the protein has such function using Bayesian approaches. Unlike other available approaches for protein annotation in which a protein has or does not have a function of interest, we give a probability for having the function. This probability indicates how confident we are about the prediction. We employ our method to predict protein functions based on "biochemical function," "subcellular location," and "cellular role" for yeast proteins defined in the Yeast Proteome Database (YPD, www.incyte.com), using the protein-protein interaction data from the Munich Information Center for Protein Sequences (MIPS, mips.gsf.de). We show that our approach outperforms other available methods for function prediction based on protein interaction data. The supplementary data is available at www-hto.usc.edu/~msms/ProteinFunction.Keywords
This publication has 20 references indexed in Scilit:
- Protein InteractionsMolecular & Cellular Proteomics, 2002
- Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometryNature, 2002
- Functional organization of the yeast proteome by systematic analysis of protein complexesNature, 2002
- Machine learning of functional class from phenotype dataBioinformatics, 2002
- A comprehensive two-hybrid analysis to explore the yeast protein interactomeProceedings of the National Academy of Sciences, 2001
- Assessment of prediction accuracy of protein function from protein–protein interaction dataYeast, 2001
- YPDTM, PombePDTM and WormPDTM: model organism volumes of the BioKnowledgeTM Library, an integrated resource for protein informationNucleic Acids Research, 2001
- Knowledge-based analysis of microarray gene expression data by using support vector machinesProceedings of the National Academy of Sciences, 2000
- Cluster analysis and display of genome-wide expression patternsProceedings of the National Academy of Sciences, 1998
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programsNucleic Acids Research, 1997