Direct-coupling analysis of residue coevolution captures native contacts across many protein families
- 21 November 2011
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 108 (49), E1293-E1301
- https://doi.org/10.1073/pnas.1111471108
Abstract
The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intradomain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and interdomain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
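The distinction between direct and indirect correlations drawn in the abstract can be illustrated with a toy Gaussian analogy. The sketch below is not the DCA implementation described in the paper, which operates on categorical amino acid variables in a multiple sequence alignment. It assumes a hypothetical chain of three continuous variables, 0 - 1 - 2, in which only neighbours are directly coupled. The names `invert`, `direct`, `covariance`, and `recovered` are illustrative and do not come from the source.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I]
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column as pivot
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    # The right half now holds the inverse
    return [row[n:] for row in aug]


# Assumed toy model: a precision (direct-coupling) matrix for the chain 0 - 1 - 2.
# The zero at [0][2] means variables 0 and 2 have no direct coupling.
direct = [
    [2, -1, 0],
    [-1, 2, -1],
    [0, -1, 2],
]

# What one would observe in data: the covariance is the inverse of the precision.
covariance = invert(direct)

# Inverting the observed covariance recovers the direct couplings.
recovered = invert(covariance)

print("covariance between 0 and 2:", round(covariance[0][2], 6))
print("recovered direct coupling 0-2:", round(abs(recovered[0][2]), 6))
```

In this toy model, variables 0 and 2 are correlated only because both are coupled to variable 1, yet their covariance is nonzero. Inverting the covariance matrix removes that indirect effect and restores the zero direct coupling. In the Gaussian setting this separation is exact, whereas for sequence data it is at best an approximation.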