Phylogenies from Gene Frequencies: A Statistical Problem
- 1 September 1985
- journal article
- research article
- Published by JSTOR in Systematic Zoology
- Vol. 34 (3) , 300-311
- https://doi.org/10.2307/2413149
Abstract
Inferring phylogenies from gene frequencies should be regarded as a statistical problem rather than treated in the framework of the hypothetico-deductive method. The approximations involved in a statistical treatment are discussed, as are the models for gene frequency change implicit in the use of statistical methods. In particular, the different genetic distance statistics have different assumptions. These are pointed out, and a Markov chain treatment of genetic drift in a small population is used to evaluate the behavior of a number of the most popular genetic distances. Distance measures that are standardized to correct for the effect of initial gene frequencies behave as expected, but only for moderate amounts of time and when initial gene frequencies are not extreme. The distance measures such as those by Balakrishnan and Sanghvi and by Cavalli-Sforza and Edwards, which are standardized for the effect of the initial gene frequencies, perform acceptably if the initial gene frequencies are not too extreme and the divergence time is not larger than twice the effective population size. A genetic distance based on discrete character coding of alleles according to their presence or absence shows quite unusual behavior. No genetic distance copes very well with extreme initial gene frequencies. The prospect of getting usable additional information from population samples of sequences is also discussed. Criticisms by J.S. Farris of an earlier paper that used statistical criteria to evaluate parsimony and compatibility methods are rebutted.
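The Markov chain treatment of genetic drift mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's own computation. It assumes a Wright-Fisher model (binomial resampling of gene copies each generation) at a single two-allele locus, and it uses the unscaled chord length between square-root-transformed frequencies as a stand-in for a Cavalli-Sforza and Edwards style distance. The function names, the population size, and the choice of statistic are all hypothetical choices made for illustration.

```python
import math

def wright_fisher_matrix(n_copies):
    """Transition matrix P[i][j]: probability that i copies of an allele
    become j copies after one generation of binomial sampling
    (assumed Wright-Fisher model, n_copies gene copies in total)."""
    matrix = []
    for i in range(n_copies + 1):
        p = i / n_copies
        row = [math.comb(n_copies, j) * p**j * (1 - p)**(n_copies - j)
               for j in range(n_copies + 1)]
        matrix.append(row)
    return matrix

def evolve(dist, matrix, generations):
    """Advance a probability distribution over copy numbers by the given
    number of generations (repeated vector-matrix multiplication)."""
    size = len(dist)
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(size))
                for j in range(size)]
    return dist

def chord_distance(p, q):
    """Unscaled chord length between two biallelic populations with allele
    frequencies p and q (illustrative stand-in, not the paper's statistic)."""
    cos_theta = math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))
    # max() guards against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cos_theta)))

def expected_chord_distance(n_copies, start_copies, generations):
    """Expected chord distance between two populations that drift
    independently from the same starting copy number."""
    matrix = wright_fisher_matrix(n_copies)
    start = [0.0] * (n_copies + 1)
    start[start_copies] = 1.0
    dist = evolve(start, matrix, generations)
    freqs = [k / n_copies for k in range(n_copies + 1)]
    return sum(dist[a] * dist[b] * chord_distance(freqs[a], freqs[b])
               for a in range(n_copies + 1)
               for b in range(n_copies + 1))

if __name__ == "__main__":
    # Hypothetical settings: 20 gene copies, intermediate vs. extreme start.
    for start_copies in (10, 1):
        for t in (0, 5, 20, 40):
            d = expected_chord_distance(20, start_copies, t)
            print(f"start={start_copies}/20  t={t:>2}  E[distance]={d:.4f}")
```

Running the sketch with an intermediate and an extreme starting frequency gives a rough feel for how the expected distance grows with divergence time under these assumptions; it makes no claim to reproduce the paper's results.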