Statistical analysis of domains in interacting protein pairs
Open Access
- 27 October 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (7) , 993-1001
- https://doi.org/10.1093/bioinformatics/bti086
Abstract
Motivation: Several methods have recently been developed to analyse large-scale sets of physical interactions between proteins in terms of physical contacts between the constituent domains, often with a view to predicting new pairwise interactions. Our aim is to combine genomic interaction data, in which domain–domain contacts are not explicitly reported, with the domain-level structure of individual proteins, in order to learn about the structure of interacting protein pairs. Our approach is driven by the need to assess the evidence for physical contacts between domains in a statistically rigorous way.
Results: We develop a statistical approach that assigns p-values to pairs of domain superfamilies, measuring the strength of evidence within a set of protein interactions that domains from these superfamilies form contacts. A set of p-values is calculated for SCOP superfamily pairs, based on a pooled data set of interactions from yeast. These p-values can be used to predict which domains come into contact in an interacting protein pair. This predictive scheme is tested against protein complexes in the Protein Quaternary Structure (PQS) database, and is used to predict domain–domain contacts within 705 interacting protein pairs taken from our pooled data set.
Contact: thomas.nye@mrc-bsu.cam.ac.uk
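The abstract does not state which test statistic the authors use, so the following is only an illustrative sketch of the general idea of attaching a p-value to a superfamily pair. It substitutes a simple hypergeometric enrichment test, which is an assumption and not the paper's method: it asks whether protein pairs carrying superfamilies a and b are over-represented among the observed interactions relative to all possible protein pairs. The function names, the toy protein and superfamily labels, and the rule of picking the candidate pair with the smallest p-value are all hypothetical.

```python
from itertools import combinations
from math import comb


def pair_has_superfamilies(doms_p, doms_q, a, b):
    """True if one protein carries superfamily a and the other carries b."""
    return (a in doms_p and b in doms_q) or (b in doms_p and a in doms_q)


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def superfamily_pair_p_value(domains, interactions, a, b):
    """Enrichment p-value for superfamily pair (a, b).

    domains:      dict mapping protein name -> set of superfamily labels
    interactions: list of distinct unordered protein pairs (no self-pairs)
    Assumption: the null model treats the interactions as a uniform random
    sample of all protein pairs, which is a stand-in, not the paper's model.
    """
    all_pairs = list(combinations(sorted(domains), 2))
    N = len(all_pairs)
    K = sum(pair_has_superfamilies(domains[p], domains[q], a, b) for p, q in all_pairs)
    n = len(interactions)
    k = sum(pair_has_superfamilies(domains[p], domains[q], a, b) for p, q in interactions)
    return hypergeom_upper_tail(k, N, K, n)


def predict_contact(doms_p, doms_q, p_values):
    """Pick the superfamily pair across two proteins with the smallest p-value.

    p_values is keyed by frozenset of superfamily labels. Returns None when
    no candidate pair has a p-value. The 'smallest p-value' rule is an assumption.
    """
    candidates = [(a, b) for a in doms_p for b in doms_q
                  if frozenset((a, b)) in p_values]
    if not candidates:
        return None
    return min(candidates, key=lambda ab: p_values[frozenset(ab)])


# Toy data (hypothetical labels, not taken from SCOP or the yeast data set).
toy_domains = {
    "P1": {"sfA"}, "P2": {"sfB"}, "P3": {"sfA"},
    "P4": {"sfB"}, "P5": {"sfC"}, "P6": {"sfC"},
}
toy_interactions = [("P1", "P2"), ("P3", "P4"), ("P5", "P6")]
p_ab = superfamily_pair_p_value(toy_domains, toy_interactions, "sfA", "sfB")
```

On real data a per-pair test like this would be run across many superfamily pairs, so some multiple-testing correction would normally be needed before interpreting the resulting p-values.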