Characterization and prediction of residues determining protein functional specificity
Open Access
- 1 May 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (13) , 1473-1480
- https://doi.org/10.1093/bioinformatics/btn214
Abstract
Motivation: Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity determining positions (SDPs) aids in protein function prediction, drug design and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally determined SDPs.
Results: We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a dataset of SDPs. The resulting large dataset, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large dataset of enzyme SDPs.
Availability: Datasets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/
Contact: msingh@cs.princeton.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
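The abstract names two ideas without giving their formulas: scoring a position by how well it separates subtypes, and adjusting a position's score using the conservation of neighboring positions. The Python sketch below illustrates both ideas in a simplified form and is not the authors' GroupSim or ConsWin code. Everything in it is an assumption made for illustration: the function names (group_contrast_score, window_smooth), the use of plain residue identity as the similarity measure, and the window and lam parameters.

```python
from itertools import combinations, product


def avg_identity(pairs):
    """Fraction of residue pairs that are identical (0.0 if there are no pairs)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def group_contrast_score(column_by_group):
    """Score one alignment column split by subtype.

    column_by_group: list of strings, one per subtype, holding the residues
    seen at this column. Returns average within-group identity minus average
    between-group identity (an illustrative measure, not the published one).
    """
    within = [avg_identity(combinations(g, 2)) for g in column_by_group if len(g) > 1]
    within_avg = sum(within) / len(within) if within else 0.0
    between_pairs = [
        pair
        for g1, g2 in combinations(column_by_group, 2)
        for pair in product(g1, g2)
    ]
    return within_avg - avg_identity(between_pairs)


def window_smooth(scores, conservation, window=3, lam=0.7):
    """Blend each column's score with the mean conservation of its neighbors.

    window and lam are hypothetical parameters chosen for illustration.
    """
    n = len(scores)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbors = [conservation[j] for j in range(lo, hi) if j != i]
        neighbor_avg = sum(neighbors) / len(neighbors) if neighbors else 0.0
        smoothed.append(lam * scores[i] + (1 - lam) * neighbor_avg)
    return smoothed
```

Under these assumptions, a column that is uniform within each subtype but differs between subtypes (for example "AAA" and "GGG") receives the maximum score, while a column that is identical across the whole family scores zero. A substitution-matrix similarity could replace the identity comparison without changing the structure of the sketch.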