Accurate and fast methods to estimate the population mutation rate from error prone sequences

Open Access

11 August 2009

journal article
research article
Published by Springer Nature in BMC Bioinformatics

Vol. 10 (1) , 247
https://doi.org/10.1186/1471-2105-10-247

Abstract

The population mutation rate (θ) remains one of the most fundamental parameters in genetics, ecology, and evolutionary biology. However, its accurate estimation can be seriously compromised when working with error-prone data such as expressed sequence tags, low-coverage draft sequences, and other unfinished products. This study is premised on the simple idea that a random sequence error, arising by chance during data collection or recording, will appear in a population dataset as a singleton (i.e., as a polymorphic site where one sampled sequence exhibits a unique base relative to the common nucleotide of the others). Thus, one can avoid these random errors by ignoring the singletons within a dataset.
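
The singleton idea can be illustrated with a minimal Python sketch. It is not the estimators developed in the article, only a simple moment-based analogue of Watterson's estimator under the standard neutral infinite-sites model, where the expected number of sites with derived-allele count i is θ/i. Under that assumption a sample of n sequences is expected to contain θ·n/(n−1) singleton sites (counting either base as the rare one), so the non-singleton segregating sites can be rescaled by a_n − n/(n−1), with a_n = 1 + 1/2 + … + 1/(n−1). The function name `theta_estimates`, the input format (equal-length aligned strings without gaps), and the choice to skip sites with more than two bases are all assumptions made for illustration.

```python
from collections import Counter


def theta_estimates(seqs):
    """Return (theta_watterson, theta_no_singletons) for aligned sequences.

    Hypothetical helper for illustration only. Assumes a neutral
    infinite-sites model, equal-length gap-free sequences, and n >= 4
    (for n < 4 every segregating site is a singleton).
    Estimates are per locus, not per site.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    segregating = 0
    singletons = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) != 2:
            # Monomorphic site, or more than two bases (skipped here
            # because it violates the infinite-sites assumption).
            continue
        segregating += 1
        if min(counts.values()) == 1:
            # Exactly one sequence differs from all the others.
            singletons += 1

    a_n = sum(1.0 / i for i in range(1, n))
    theta_w = segregating / a_n
    # Remove the expected singleton share, theta * n / (n - 1).
    theta_ns = (segregating - singletons) / (a_n - n / (n - 1))
    return theta_w, theta_ns


if __name__ == "__main__":
    sample = ["AAGT", "AAGT", "ACTT", "ACTA"]
    print(theta_estimates(sample))
```

In the toy sample, the last column is a singleton and is excluded from the second estimate, while both estimates use the two columns where the sample splits two against two. With real data, a second estimate well below the first would be consistent with excess singletons from sequencing error, though genuine rare variants inflate the singleton class as well.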
