Genome-Wide Prediction of SH2 Domain Targets Using Structural Information and the FoldX Algorithm

Open Access

4 April 2008

journal article
research article
Published by Public Library of Science (PLoS) in PLoS Computational Biology

Vol. 4 (4), e1000052
https://doi.org/10.1371/journal.pcbi.1000052

Abstract

Current experiments likely cover only a fraction of all protein-protein interactions. Here, we developed a method to predict SH2-mediated protein-protein interactions using the structure of SH2-phosphopeptide complexes and the FoldX algorithm. We show that our approach performs similarly to experimentally derived consensus sequences and substitution matrices at predicting known in vitro and in vivo targets of SH2 domains. We use our method to provide a set of high-confidence interactions for human SH2 domains with known structure, filtered on secondary structure and phosphorylation state. We validated the predictions using literature-derived SH2 interactions and a probabilistic score obtained from a naive Bayes integration of information on coexpression, conservation of the interaction in other species, shared interaction partners, and functions. We show how our predictions lead to a new hypothesis for the role of SH2 domains in signaling.

Understanding the functional role of every protein in the cell is a long-standing goal of cellular biology. An important step in this direction is to discover how and when proteins interact inside the cell to accomplish their tasks. Many cellular functions depend on reversible protein modifications such as phosphorylation. To sense these modifications, cells have protein domains capable of binding phosphorylated proteins, such as the SH2 domain. In this work, we show that it is possible to use the three-dimensional structure of protein domains to predict their binding preferences. Using a computational tool called FoldX, we have predicted the binding specificity of several human SH2 domains. These predictions, based on computational analysis of the 3-D structure, were shown to be similar in accuracy to those obtained from experimental binding assays. We show here that it is also possible to understand how a mutation changes the binding preference of protein binding domains, opening the way to a better understanding of some disease-causing mutations. Combining this novel computational approach with other sources of information allowed us to provide a set of high-confidence novel interactions for the proteins studied here.
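
The naive Bayes integration mentioned in the abstract can be illustrated with a minimal sketch. It assumes each evidence source (coexpression, conservation, shared partners, shared functions) has already been reduced to a likelihood ratio and that the sources are conditionally independent. The function name, the prior odds, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def combine_evidence(prior_odds, likelihood_ratios):
    """Naive Bayes integration of independent evidence sources.

    prior_odds: prior odds that a candidate pair truly interacts (assumed value).
    likelihood_ratios: mapping of evidence name -> P(evidence | interacting)
        divided by P(evidence | not interacting) (assumed values).
    Returns the posterior probability of interaction.
    """
    # Work in log space so that many small or large ratios stay numerically stable.
    log_odds = math.log(prior_odds)
    for ratio in likelihood_ratios.values():
        log_odds += math.log(ratio)
    odds = math.exp(log_odds)
    # Convert posterior odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios for one candidate SH2 domain / target pair.
example_ratios = {
    "coexpression": 3.0,
    "conservation_in_other_species": 5.0,
    "shared_interaction_partners": 2.0,
    "shared_functions": 4.0,
}

posterior = combine_evidence(prior_odds=0.001, likelihood_ratios=example_ratios)
print(f"posterior probability of interaction: {posterior:.4f}")
```

Under the independence assumption the posterior odds are simply the prior odds multiplied by every likelihood ratio, which is why weak individual signals can add up to a high-confidence call.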
