ISIS: interaction sites identified from sequence
Open Access
- 15 January 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (2) , e13-e16
- https://doi.org/10.1093/bioinformatics/btl303
Abstract
Motivation: Large-scale experiments reveal pairs of interacting proteins but leave the residues involved in the interactions unknown. These interface residues are essential for understanding the mechanism of interaction and are often desired drug targets. Reliable identification of residues that reside in protein–protein interfaces typically requires analysis of protein structure. Therefore, for the vast majority of proteins, for which there is no high-resolution structure, there is no effective way of identifying interface residues.
Results: Here we present a machine learning-based method that identifies interacting residues from sequence alone. Although the method is developed using transient protein–protein interfaces from complexes of experimentally known 3D structures, it never explicitly uses 3D information. Instead, we combine predicted structural features with evolutionary information. The strongest predictions of the method reached over 90% accuracy in a cross-validation experiment. Our results suggest that despite the significant diversity in the nature of protein–protein interactions, they all share common basic principles and that these principles are identifiable from sequence alone.
Contact: yanay.ofran@columbia.edu
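The abstract does not spell out how predicted structural features and evolutionary information are combined. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way to build per-residue inputs for a sequence-based classifier: concatenate an evolutionary profile row with predicted structural values for each residue, then collect those vectors over a sliding window centred on the residue of interest. The function name `window_features`, the window size, the zero padding at the sequence ends, and the toy numbers are all assumptions made for this example.

```python
from typing import List

Vector = List[float]


def window_features(profile: List[Vector],
                    structural: List[Vector],
                    window: int = 9) -> List[Vector]:
    """Build one input vector per residue from a sliding window.

    profile    -- hypothetical evolutionary profile, one row per residue
    structural -- hypothetical predicted structural features, one row per residue
    window     -- odd window width (an assumed value, not taken from the paper)
    """
    if len(profile) != len(structural):
        raise ValueError("profile and structural must cover the same residues")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")

    # Per-residue vector: evolutionary values followed by structural values.
    per_residue = [p + s for p, s in zip(profile, structural)]
    width = len(per_residue[0]) if per_residue else 0
    padding = [0.0] * width  # stands in for positions beyond either terminus
    half = window // 2

    rows = []
    for i in range(len(per_residue)):
        row: Vector = []
        for j in range(i - half, i + half + 1):
            inside = 0 <= j < len(per_residue)
            row.extend(per_residue[j] if inside else padding)
        rows.append(row)
    return rows


# Toy input: three residues, two profile values and one structural value each.
toy_profile = [[0.1, 0.9], [0.5, 0.5], [0.8, 0.2]]
toy_structural = [[1.0], [0.0], [1.0]]
toy_rows = window_features(toy_profile, toy_structural, window=3)
```

Each row in `toy_rows` would then be passed to a classifier that labels the central residue as interface or non-interface. The choice of classifier and the real feature set are outside what the abstract states.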