Choosing negative examples for the prediction of protein-protein interactions
Open Access
- 20 March 2006
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 7 (S1) , S2
- https://doi.org/10.1186/1471-2105-7-s1-s2
Abstract
The protein-protein interaction networks of even well-studied model organisms are sketchy at best, highlighting the continued need for computational methods to help direct experimentalists in the search for novel interactions. This need has prompted the development of a number of methods for predicting protein-protein interactions based on various sources of data and methodologies. The common method for choosing negative examples for training a predictor of protein-protein interactions is based on annotations of cellular localization, and on the observation that pairs of proteins with different localization patterns are unlikely to interact. While this method leads to high-quality sets of non-interacting proteins, we find that this choice can lead to biased estimates of prediction accuracy, because the constraints placed on the distribution of the negative examples make the task easier. The effects of this bias are demonstrated in the context of both sequence-based and non-sequence-based features used for predicting protein-protein interactions.
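The localization-based selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy protein identifiers, and the compartment labels are all hypothetical, and the uniform-random sampler is included only as a point of contrast, an assumption rather than something stated in the abstract.

```python
import itertools
import random


def candidate_pairs(proteins, positives):
    """Yield all unordered protein pairs that are not known interactions."""
    for a, b in itertools.combinations(sorted(proteins), 2):
        if frozenset((a, b)) not in positives:
            yield a, b


def localization_negatives(localization, positives):
    """Keep only pairs whose annotated compartments do not overlap.

    Proteins without any annotation are skipped, since nothing can be
    concluded about them from localization alone.
    """
    return [
        (a, b)
        for a, b in candidate_pairs(localization, positives)
        if localization[a]
        and localization[b]
        and localization[a].isdisjoint(localization[b])
    ]


def random_negatives(localization, positives, n, seed=0):
    """Sample pairs uniformly from all non-interacting pairs (contrast case)."""
    pool = list(candidate_pairs(localization, positives))
    rng = random.Random(seed)
    return rng.sample(pool, min(n, len(pool)))


# Hypothetical toy data: protein -> set of annotated compartments.
localization = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"mitochondrion"},
    "P4": {"cytoplasm", "nucleus"},
}
# Known interactions, stored as unordered pairs.
positives = {frozenset(("P1", "P2"))}

loc_neg = localization_negatives(localization, positives)
rand_neg = random_negatives(localization, positives, n=3)
```

In this toy setting every localization-based negative pairs proteins from different compartments, while the random sample may also contain co-localized pairs. That difference in the distribution of negatives is the kind of constraint the abstract identifies as a possible source of optimistic accuracy estimates.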