Using genome-wide measurements for computational prediction of SH2–peptide interactions
Open Access
- 5 June 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 37 (14), 4629-4641
- https://doi.org/10.1093/nar/gkp394
Abstract
Peptide-recognition modules (PRMs) are used throughout biology to mediate protein–protein interactions, and many PRMs are members of large protein domain families. Recent genome-wide measurements describe networks of peptide–PRM interactions. In these networks, very similar PRMs recognize distinct sets of peptides, raising the question of how peptide-recognition specificity is achieved using similar protein domains. The analysis of individual protein complex structures often gives answers that are not easily applicable to other members of the same PRM family. Bioinformatics-based approaches, on the other hand, may be difficult to interpret physically. Here we integrate structural information with a large, quantitative data set of SH2 domain–peptide interactions to study the physical origin of domain–peptide specificity. We develop an energy model, inspired by protein folding, based on interactions between the amino-acid positions in the domain and peptide. We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families, predict optimal peptides for a given SH2 domain, or study other biological interactions, e.g. protein–DNA interactions.
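The abstract describes an energy model built from interactions between amino-acid positions in the domain and in the peptide. As a rough illustration of that general idea only, and not the authors' actual model, the sketch below sums a pairwise energy over a list of contacting position pairs and calls an interaction "binding" when the total falls below a threshold. The function names, the contact list, the 20x20 energy matrix, the threshold, and the example sequences are all hypothetical; the plain 20-letter alphabet is also an assumption and does not represent phosphotyrosine.

```python
import numpy as np

# Assumption: standard 20-letter amino-acid alphabet (no phosphotyrosine symbol).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def binding_energy(domain_seq, peptide_seq, contacts, pair_energy):
    """Sum pairwise energies over (domain position, peptide position) contacts.

    contacts    -- hypothetical list of (i, j) index pairs assumed to be in contact
    pair_energy -- hypothetical 20x20 array; entry [a, b] is the energy assigned
                   to residue type a (domain) contacting residue type b (peptide)
    """
    total = 0.0
    for i, j in contacts:
        a = AA_INDEX[domain_seq[i]]
        b = AA_INDEX[peptide_seq[j]]
        total += float(pair_energy[a, b])
    return total


def predict_binding(energy, threshold=0.0):
    """Predict an interaction when the energy is below the (assumed) threshold."""
    return bool(energy < threshold)


# Toy usage with made-up values: one favorable R-E contact, all others neutral.
pair_energy = np.zeros((20, 20))
pair_energy[AA_INDEX["R"], AA_INDEX["E"]] = -1.0

domain_seq = "ARK"            # made-up domain fragment
peptide_seq = "EV"            # made-up peptide fragment
contacts = [(1, 0), (2, 1)]   # R contacts E, K contacts V

energy = binding_energy(domain_seq, peptide_seq, contacts, pair_energy)
print(energy, predict_binding(energy))
```

In a sketch like this, the entries of the energy matrix would have to be fitted to measured interaction data and checked by cross-validation; here they are set by hand purely to show the bookkeeping.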