Correlated mutations: Advances and limitations. A study on fusion proteins and on the Cohesin‐Dockerin families

28 February 2006

journal article
research article
Published by Wiley in Proteins-Structure Function and Bioinformatics

Vol. 63 (4), 832-845
https://doi.org/10.1002/prot.20933

Abstract

Correlated mutations have been repeatedly exploited for intramolecular contact map prediction. Over the last decade, these efforts have yielded several methods for measuring correlated mutations. Nevertheless, the application of correlated mutations to the prediction of intermolecular interactions has not yet been explored. This gap is due to several obstacles, such as the availability of 3D complexes, paralog discrimination, and the availability of sequence pairs, which are required for inter‐ but not intramolecular analyses. Here we selected for analysis fusion protein families that bypass some of these obstacles. We find that several correlated mutation measurements yield reasonable accuracy for intramolecular contact map prediction on the fusion dataset. However, the accuracy drops sharply for intermolecular contact prediction. This drop does not always occur: in the Cohesin‐Dockerin family, reasonable accuracy is achieved in the prediction of both intra‐ and intermolecular contacts. The Cohesin‐Dockerin family is well suited for correlated mutation analysis. However, because this family constitutes a special case (it has radical mutations and domain repeats, and within each species each Dockerin domain interacts with each Cohesin domain, see below), the successful prediction in this family does not point to a general potential for using correlated mutations to predict intermolecular contacts. Overall, the results of our study indicate that current methodologies of correlated mutation analysis are not suitable for large‐scale intermolecular contact prediction, and thus cannot assist in docking. With current measurements, sequence availability, sequence annotations, and underdeveloped sequence pairing methods, correlated mutations can yield reasonable accuracy for only a handful of families. Proteins 2006.
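
For readers unfamiliar with the idea, the sketch below illustrates one common way of scoring correlated mutations between two columns of a multiple sequence alignment: mutual information. This is an illustration only. The abstract does not say which measurements the study used, so the choice of mutual information, the function names (`column`, `mutual_information`, `rank_pairs`), and the toy alignment are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in msa]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Assumption: plain frequency estimates, with no sequence weighting,
    gap handling, or conservation correction.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_pairs(msa):
    """Score every column pair and sort from most to least correlated."""
    length = len(msa[0])
    scores = [
        (mutual_information(column(msa, i), column(msa, j)), i, j)
        for i, j in combinations(range(length), 2)
    ]
    scores.sort(reverse=True)
    return scores


# Hypothetical toy alignment: columns 0 and 1 co-vary, column 2 is conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
top_score, top_i, top_j = rank_pairs(toy_msa)[0]
print(top_i, top_j, round(top_score, 3))
```

In an intramolecular setting, the highest-scoring column pairs would be taken as predicted contacts. For an intermolecular analysis, the two columns would instead come from paired sequences of the two interacting partners, which is where the sequence pairing obstacle described in the abstract arises. Fusion proteins sidestep that step because both domains sit in one sequence.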
