Detecting coevolving amino acid sites using Bayesian mutational mapping
Open Access
- 1 June 2005
- journal article
- conference paper
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (Suppl 1) , i126-i135
- https://doi.org/10.1093/bioinformatics/bti1032
Abstract
Motivation: The evolution of protein sequences is constrained by complex interactions between amino acid residues. Because harmful substitutions may be compensated for by other substitutions at neighboring sites, residues can coevolve. We describe a Bayesian phylogenetic approach to the detection of coevolving residues in protein families. This method, Bayesian mutational mapping (BMM), assigns mutations to the branches of the evolutionary tree stochastically, and then test statistics are calculated to determine whether a coevolutionary signal exists in the mapping. Posterior predictive P-values provide an estimate of significance, and specificity is maintained by integrating over uncertainty in the estimation of the tree topology, branch lengths and substitution rates. A coevolutionary Markov model for codon substitution is also described, and this model is used as the basis of several test statistics.
Results: Results on simulated coevolutionary data indicate that the BMM method can successfully detect nearly all coevolving sites when the model has been correctly specified, and that non-parametric statistics such as mutual information are generally less powerful than parametric statistics. On a dataset of eukaryotic proteins from the phosphoglycerate kinase (PGK) family, interdomain site contacts yield a significantly greater coevolutionary signal than interdomain non-contacts, an indication that the method provides information about interacting sites. Failure to account for the heterogeneity in rates across sites in PGK resulted in a less discriminating test, yielding a marked increase in the number of reported positives at both contact and non-contact sites.
Contact: matt@dimmic.net
Supplementary information: http://www.dimmic.net/supplement/
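The two generic ingredients named in the abstract, a non-parametric statistic such as mutual information and a posterior predictive P-value, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `mutual_information` and `posterior_predictive_p` are hypothetical, the statistic is computed here directly on two alignment columns rather than on a stochastic mutational mapping, and the replicate statistics are assumed to have been produced elsewhere (for example, by simulating data without coevolution on trees and rates drawn from the posterior).

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equal-length columns.

    Illustrative assumption: each column is a sequence of residue symbols,
    one per taxon, taken straight from an alignment.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab * log(p_ab / (p_a * p_b)), expressed with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def posterior_predictive_p(observed, replicates):
    """Fraction of replicate statistics at least as large as the observed one.

    Illustrative assumption: `replicates` holds the same statistic computed
    on datasets simulated under the null (no coevolution) model.
    """
    if len(replicates) == 0:
        raise ValueError("need at least one replicate")
    return sum(1 for r in replicates if r >= observed) / len(replicates)
```

A small P-value from `posterior_predictive_p` would suggest that the observed statistic is unusually large relative to data generated without coevolution. Because the replicates would be drawn across posterior samples, uncertainty in the tree and substitution rates is carried into the P-value rather than fixed at a single estimate.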