The random character of protein evolution and its effects on the reliability of phylogenetic information deduced from amino acid sequences and compositions

1 November 1980

journal article
research article
Published by Portland Press Ltd. in Biochemical Journal

Vol. 191 (2), 349-354
https://doi.org/10.1042/bj1910349

Abstract

Because evolution occurs by random events, the actual number of substitutions that occur in any period is not exactly equal to the number expected from the mean rate of substitution, but is statistically distributed about it. In consequence, even if rates of evolution are constant in different lineages, ‘trees’ deduced from descendant protein sequences contain random errors. When there are fewer than about eight differences between the sequences of the most distantly related pair from a set of proteins, this random effect is very large. It can then render trivial the statistical disadvantage inherent in using a crude measure of protein difference, such as amino acid composition or immunological cross-reactivity, in preference to a measure based on amino acid sequence. In some cases, such as classification of mammals on the basis of cytochrome c structure, it appears to make little difference to the reliability of the results whether the sequences of the protein concerned are known or not. It may also be possible to obtain more reliable phylogenetic information from composition measurements on several kinds of protein than one could obtain from sequence measurements on a single kind of protein.
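
The scale of this random effect can be illustrated with a minimal sketch. It assumes that the number of substitutions follows a Poisson distribution about its expected value; the abstract does not name a distribution, so this is an assumption made for illustration only. The function names `poisson_pmf`, `relative_sd`, and `prob_outside` are hypothetical, and the 25% tolerance is an arbitrary choice.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k substitutions (assumed Poisson model)."""
    # Work in log space so that large k or large mean does not overflow.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def relative_sd(mean):
    """Standard deviation of a Poisson count divided by its mean."""
    return 1.0 / math.sqrt(mean)


def prob_outside(mean, fraction):
    """Probability that the observed count differs from the mean by more
    than `fraction` of the mean, under the assumed Poisson model."""
    # Sum far enough into the upper tail that the omitted mass is negligible.
    k_max = int(mean + 20 * math.sqrt(mean)) + 20
    inside = sum(
        poisson_pmf(k, mean)
        for k in range(k_max + 1)
        if abs(k - mean) <= fraction * mean
    )
    return 1.0 - inside


if __name__ == "__main__":
    # Compare a small expected number of differences with larger ones.
    for mean in (2, 8, 32, 128):
        print(
            f"expected={mean:4d}  relative SD={relative_sd(mean):.2f}  "
            f"P(off by >25%)={prob_outside(mean, 0.25):.3f}"
        )
```

Under this assumption the relative spread falls only as the inverse square root of the expected count, so with about eight expected differences a single observed count can easily be off by a quarter or more. That is the sense in which random variation can swamp the extra precision of a sequence-based measure.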
