A regression framework incorporating quantitative and negative interaction data improves quantitative prediction of PDZ domain–peptide interaction from primary sequence
Open Access
- 2 December 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 27 (3), 383-390
- https://doi.org/10.1093/bioinformatics/btq657
Abstract
Motivation: Predicting protein interactions involving peptide recognition domains is essential for understanding the many important biological processes they mediate. It is important to consider the binding strength of these interactions to help us construct more biologically relevant protein interaction networks that consider cellular context and competition between potential binders.
Results: We developed a novel regression framework that considers both positive (quantitative) and negative (qualitative) interaction data available for mouse PDZ domains to quantitatively predict interactions between PDZ domains, a large peptide recognition domain family, and their peptide ligands using primary sequence information. First, we show that it is possible to learn from existing quantitative and negative interaction data to infer the relative binding strength of interactions involving previously unseen PDZ domains and/or peptides given their primary sequence. Performance was measured using cross-validated hold out testing and testing with previously unseen PDZ domain–peptide interactions. Second, we find that incorporating negative data improves quantitative interaction prediction. Third, we show that sequence similarity is an important prediction performance determinant, which suggests that experimentally collecting additional quantitative interaction data for underrepresented PDZ domain subfamilies will improve prediction.
Availability and Implementation: The Matlab code for our SemiSVR predictor and all data used here are available at http://baderlab.org/Data/PDZAffinity.
Contact: gary.bader@utoronto.ca; dengnaiyang@cau.edu.cn
Supplementary information: Supplementary data are available at Bioinformatics online.