An alternative model of amino acid replacement
Open Access
- 5 November 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (7), 975-980
- https://doi.org/10.1093/bioinformatics/bti109
Abstract
Motivation: The observed correlations between pairs of homologous protein sequences are typically explained in terms of a Markovian dynamic of amino acid substitution. This model assumes that every location on the protein sequence has the same background distribution of amino acids, an assumption that is incompatible with the observed heterogeneity of protein amino acid profiles and with the success of profile multiple sequence alignment.
Results: We propose an alternative model of amino acid replacement during protein evolution based upon the assumption that the variation of the amino acid background distribution from one residue to the next is sufficient to explain the observed sequence correlations of homologs. The resulting dynamical model of independent replacements drawn from heterogeneous backgrounds is simple and consistent, and provides a unified homology match score for sequence–sequence, sequence–profile and profile–profile alignment.
Contact: gec@compbio.berkeley.edu
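The central claim of the abstract, that independent draws from site-specific backgrounds are enough to produce correlations between homologs, can be illustrated with a toy calculation. The sketch below is not the paper's method: the three-letter alphabet, the two site classes, their frequencies and weights, and the log-odds form of the score are all assumptions chosen for illustration, since the abstract does not give the actual score.

```python
import math

# Toy 3-letter alphabet and two hypothetical site classes.
# All numbers here are illustrative assumptions, not values from the paper.
ALPHABET = "ABC"
BACKGROUNDS = [
    {"A": 0.8, "B": 0.1, "C": 0.1},  # sites whose background favors A
    {"A": 0.1, "B": 0.1, "C": 0.8},  # sites whose background favors C
]
WEIGHTS = [0.5, 0.5]  # assumed fraction of sites in each class


def joint(a, b):
    """P(a, b) when both homologs draw independently from the same site's background."""
    return sum(w * p[a] * p[b] for w, p in zip(WEIGHTS, BACKGROUNDS))


def marginal(a):
    """Frequency of residue a averaged over all sites."""
    return sum(w * p[a] for w, p in zip(WEIGHTS, BACKGROUNDS))


def log_odds(a, b):
    """Log-odds (bits) of an aligned pair versus two unrelated residues (assumed score form)."""
    return math.log2(joint(a, b) / (marginal(a) * marginal(b)))


# Print the toy score matrix, one row per residue.
for a in ALPHABET:
    print(a, [round(log_odds(a, b), 2) for b in ALPHABET])
```

In this toy setting the pair (A, A) scores above zero and (A, C) below zero, even though no residue ever mutates into another: the apparent correlation comes only from averaging over sites with different backgrounds. Residue B, whose frequency is the same at every site, scores zero against everything.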