ASSESSING AND COMBINING RELIABILITY OF PROTEIN INTERACTION SOURCES
Open Access
- 1 December 2006
- proceedings article
- Published by World Scientific Pub Co Pte Ltd in Pacific Symposium on Biocomputing
Abstract
Integrating diverse sources of interaction information to create protein networks requires strategies sensitive to differences in accuracy and coverage of each source. Previous integration approaches calculate reliabilities of protein interaction information sources based on congruity to a designated 'gold standard.' In this paper, we provide a comparison of the two most popular existing approaches and propose a novel alternative for assessing reliabilities which does not require a gold standard. We identify a new method for combining the resultant reliabilities and compare it against an existing method. Further, we propose an extrinsic approach to evaluation of reliability estimates, considering their influence on the downstream tasks of inferring protein function and learning regulatory networks from expression data. Results using this evaluation method show (1) our method for reliability estimation is an attractive alternative to those requiring a gold standard and (2) the new method for combining reliabilities is less sensitive to noise in reliability assignments than the similar existing technique.