Coalescent Simulations and Statistical Tests of Neutrality

Abstract
Recently, Depaulis and Veuille (1998) proposed two new test statistics based on the number and frequency of different haplotypes. In a companion letter, Markovtsova, Marjoram, and Tavaré (2001) point out that Depaulis and Veuille (1998) do not use the standard implementation for their coalescent simulations. Standard coalescent simulations first produce random genealogies, then place mutations at constant rate θ/2 (θ = 4Nμ is the population mutation parameter, where N is the effective population size and μ is the per-locus mutation rate per generation) along each of the branches (Kingman 1982a, 1982b; Hudson 1990). Instead, Depaulis and Veuille (1998) generate distributions for their statistics by first constructing random genealogies, then placing S (the observed number of segregating sites) mutations on each tree. This “fixed S” method has been used before (e.g., Hudson 1993; Rozas and Rozas 1999), partly because it is easy to simulate, and partly because S is observed, while θ must be estimated from the data (see, e.g., Fu 1996). In fact, it is not clear how to estimate θ independent of polymorphism data. Although the fixed S scheme does not directly use θ, Markovtsova, Marjoram, and Tavaré (2001) highlight that the actual distributions of test statistics conditional on S are not independent of θ. In particular, knowing both θ and S changes the expected shape of a genealogy. For example, if S is unusually large given θ, we expect the genealogy to be longer than average. Thus, the critical values in Depaulis and Veuille (1998) might not be appropriate, since the actual rejection probabilities for their tests are functions of the unknown parameter θ.
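The difference between the two simulation schemes can be made concrete with a minimal sketch. It is not the code used in any of the cited papers. It assumes no recombination, an infinite-sites mutation model, and time measured in units of 2N generations, so that k lineages coalesce at rate k(k-1)/2 and mutations fall on each branch at rate θ/2. All function names (`simulate_genealogy`, `mutations_given_theta`, `mutations_fixed_s`, `haplotype_count`) are hypothetical and chosen only for illustration.

```python
import random


def simulate_genealogy(n, rng):
    """Return the branches of a random coalescent tree as (length, leaves) pairs.

    Assumption: time is in units of 2N generations, so k lineages
    coalesce at rate k(k-1)/2.
    """
    # Each active lineage is (samples descending from it, time it was created).
    active = [(frozenset([i]), 0.0) for i in range(n)]
    branches = []
    t = 0.0
    while len(active) > 1:
        k = len(active)
        t += rng.expovariate(k * (k - 1) / 2)
        i, j = rng.sample(range(k), 2)  # pick two distinct lineages to merge
        (leaves_i, start_i), (leaves_j, start_j) = active[i], active[j]
        branches.append((t - start_i, leaves_i))
        branches.append((t - start_j, leaves_j))
        active = [a for idx, a in enumerate(active) if idx not in (i, j)]
        active.append((leaves_i | leaves_j, t))
    return branches


def poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals before `mean`."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def mutations_given_theta(branches, theta, rng):
    """Standard scheme: Poisson(theta/2 * length) mutations on each branch."""
    sites = []
    for length, leaves in branches:
        sites.extend([leaves] * poisson(theta / 2 * length, rng))
    return sites


def mutations_fixed_s(branches, s, rng):
    """Fixed S scheme: exactly s mutations, each on a branch chosen
    with probability proportional to its length. Theta is never used."""
    lengths = [length for length, _ in branches]
    chosen = rng.choices(branches, weights=lengths, k=s)
    return [leaves for _, leaves in chosen]


def haplotype_count(n, sites):
    """Number of distinct haplotypes among n samples (infinite-sites model)."""
    return len({tuple(i in leaves for leaves in sites) for i in range(n)})


rng = random.Random(1)
tree = simulate_genealogy(10, rng)
k_theta = haplotype_count(10, mutations_given_theta(tree, 5.0, rng))
k_fixed = haplotype_count(10, mutations_fixed_s(tree, 12, rng))
print(k_theta, k_fixed)
```

In the fixed S function the genealogy is drawn from its unconditional distribution and θ never enters, whereas the true distribution of the genealogy given both S and θ is tilted toward longer trees when S is large relative to θ. That mismatch is the concern raised by Markovtsova, Marjoram, and Tavaré (2001).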
