Using multiple sequence correlation analysis to characterize functionally important protein regions
Open Access
- 1 June 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 16 (6) , 397-406
- https://doi.org/10.1093/protein/gzg053
Abstract
Protein co‐evolution under structural and functional constraints necessitates the preservation of important interactions. Identifying functionally important regions poses many obstacles in protein engineering efforts. In this paper, we present a bioinformatics‐inspired approach (residue correlation analysis, RCA) for predicting functionally important domains from protein family sequence data. RCA comprises two major steps: (i) identifying pairs of residue positions that mutate in a coordinated manner, and (ii) using these results to identify protein regions that interact with an uncommonly high number of other residues. We hypothesize that strongly correlated pairs result not only from contacting pairs, but also from residues that participate in conformational changes during catalysis or in important interactions necessary for retaining functionality. The results show that highly mobile loops that assist in ligand association/dissociation tend to exhibit high correlation. RCA results exhibit good agreement with the findings of experimental and molecular dynamics studies for the three protein families that are analyzed: (i) DHFR (dihydrofolate reductase), (ii) cyclophilin, and (iii) formyl‐transferase. Specifically, the specificity (percentage of correct predictions) in all three cases is substantially higher than that obtained by entropic measures or contacting residue pairs. In addition, we use our approach in a predictive fashion to identify important regions of a transmembrane amino acid transporter protein for which there is limited structural and functional information available.
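As a rough illustration of the two steps named in the abstract, the sketch below scores every pair of alignment columns and then counts, for each position, how many strongly correlated partners it has. The abstract does not state which correlation measure RCA uses, so mutual information is swapped in here purely as a stand-in; the function names, the toy alignment, and the threshold are likewise hypothetical and not taken from the paper. Grouping positions into contiguous regions, which step (ii) implies, is left out.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Assumption: MI is only a stand-in for the paper's correlation measure.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


def correlated_pairs(msa, threshold):
    """Step (i): position pairs whose score meets a (hypothetical) threshold."""
    length = len(msa[0])
    cols = [column(msa, i) for i in range(length)]
    return [
        (i, j)
        for i, j in combinations(range(length), 2)
        if mutual_information(cols[i], cols[j]) >= threshold
    ]


def partner_counts(pairs, length):
    """Step (ii), simplified: number of correlated partners per position."""
    counts = [0] * length
    for i, j in pairs:
        counts[i] += 1
        counts[j] += 1
    return counts


# Toy alignment: positions 0 and 1 always change together,
# position 2 varies independently, position 3 is fully conserved.
toy_msa = ["ACDA", "ACEA", "GTDA", "GTEA"]
pairs = correlated_pairs(toy_msa, threshold=0.9)
counts = partner_counts(pairs, len(toy_msa[0]))
print(pairs)
print(counts)
```

Positions with unusually high partner counts would be the candidates for functionally important regions in this simplified picture. A real analysis would also need to handle alignment gaps, sequence weighting, and phylogenetic bias, none of which this sketch addresses.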