Usefulness of Single Nucleotide Polymorphism Data for Estimating Population Parameters
Open Access
- 1 September 2000
- journal article
- research article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 156 (1), 439-447
- https://doi.org/10.1093/genetics/156.1.439
Abstract
Single nucleotide polymorphism (SNP) data can be used for parameter estimation via maximum likelihood methods as long as the way in which the SNPs were determined is known, so that an appropriate likelihood formula can be constructed. We present such likelihoods for several sampling methods. As a test of these approaches, we consider use of SNPs to estimate the parameter Θ = 4Neμ (the scaled product of effective population size and per-site mutation rate), which is related to the branch lengths of the reconstructed genealogy. With infinite amounts of data, ML models using SNP data are expected to produce consistent estimates of Θ. With finite amounts of data the estimates are accurate when Θ is high, but tend to be biased upward when Θ is low. If recombination is present and not allowed for in the analysis, the results are additionally biased upward, but this effect can be removed by incorporating recombination into the analysis. SNPs defined as sites that are polymorphic in the actual sample under consideration (sample SNPs) are somewhat more accurate for estimation of Θ than SNPs defined by their polymorphism in a panel chosen from the same population (panel SNPs). Misrepresenting panel SNPs as sample SNPs leads to large errors in the maximum likelihood estimate of Θ. Researchers collecting SNPs should collect and preserve information about the method of ascertainment so that the data can be accurately analyzed.
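For orientation only, the following is a minimal sketch of how a per-site Θ can be estimated from fully sequenced data. It is not the coalescent-based maximum likelihood method described in the abstract. It uses Watterson's segregating-sites moment estimator as a simpler stand-in, and it assumes every site in the region was sequenced in every sampled individual, so it does not correct for SNP ascertainment and should not be applied to panel SNPs. The function name and its parameters are hypothetical, chosen for illustration.

```python
def watterson_theta(num_segregating_sites, num_sequences, num_sites):
    """Per-site estimate of Theta = 4*Ne*mu from segregating sites.

    Illustrative stand-in, NOT the paper's ML method.
    Assumes complete sequence data (no SNP ascertainment).

    num_segregating_sites: count of polymorphic sites in the sample (S)
    num_sequences:         number of sampled sequences (n)
    num_sites:             number of sites surveyed in the region (L)
    """
    if num_sequences < 2:
        raise ValueError("need at least two sequences")
    if num_sites <= 0:
        raise ValueError("num_sites must be positive")
    if num_segregating_sites < 0:
        raise ValueError("num_segregating_sites must be non-negative")

    # a_n = 1 + 1/2 + ... + 1/(n-1), the expected total tree length
    # of the genealogy in units scaled so that E[S] = Theta * L * a_n.
    a_n = sum(1.0 / i for i in range(1, num_sequences))

    return num_segregating_sites / (a_n * num_sites)


if __name__ == "__main__":
    # Hypothetical numbers: 11 segregating sites, 4 sequences, 1000 sites.
    print(watterson_theta(11, 4, 1000))
```

Because the estimator divides by the number of surveyed sites, it needs the count of monomorphic sites as well as the SNPs. That information is typically missing from SNP-only data sets, which is one reason the ascertainment scheme has to be recorded and modeled explicitly.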