Evolutionary trees from nucleic acid and protein sequences
- 23 December 1985
- journal article
- research article
- Published by The Royal Society in Proceedings of the Royal Society of London. B. Biological Sciences
- Vol. 226 (1244), 271-302
- https://doi.org/10.1098/rspb.1985.0096
Abstract
The problem addressed is that of estimating evolutionary relationship by the comparative study of the nucleic acid or protein sequences of living organisms. The most important point made in this account is that estimation of evolutionary relationship should be based on clearly defined models, the assumptions of which are open to test. The models should as far as possible conform to what is known about the processes of evolutionary change in the organisms concerned. Prevailing approaches, grouped here as divergence models, are stated below in such a way that it is clear that they involve unrealistic assumptions about the nature of evolutionary change. Emphasis is placed on the use of probabilistic models of evolutionary change. The historical development of these models has proceeded in parallel with the more commonly used "parsimony" methods. The problem of reconstructing phylogenies is simplified by assuming that the pathways of genetic transmission conform to a tree structure. The tree model is justified on the grounds that such pathways may be traced in a genealogy; however, the tree model ignores hybridization and horizontal transmission of the genetic material. The other essential component is a probabilistic formulation of the processes of genetic change. Consideration of genetic reliability (a view of mutation as failure correctly to copy information) leads to such a probabilistic description. Several proposed schemes which make numerical assessment of the relative frequencies of base substitution in DNA are considered. We next examine methods for the estimation of phylogenetic trees on the basis of probabilistic models. Pairwise estimates of divergence times lead rapidly to hypotheses of evolutionary relationship, but it is stressed that joint estimation procedures, which simultaneously take account of all the data, lead to more complete estimates of relationship.
The various methods are illustrated as applied to the analysis of nucleic acid sequence data from the mammalian mitochondrial genome. Finally, we discuss weaknesses of the current stochastic models and point out ways in which accumulating experimental information may lead to their refinement or refutation.