Deep Sequencing of a Genetically Heterogeneous Sample: Local Haplotype Reconstruction and Read Error Correction
- 1 March 2010
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 17 (3), 417-428
- https://doi.org/10.1089/cmb.2009.0164
Abstract
We present a computational method for analyzing deep sequencing data obtained from a genetically diverse sample. The set of reads obtained from a deep sequencing experiment represents a statistical sample of the underlying population. We develop a generative probabilistic model for assigning observed reads to unobserved haplotypes in the presence of sequencing errors. This clustering problem is solved in a Bayesian fashion using the Dirichlet process mixture to define a prior distribution on the unknown number of haplotypes in the mixture. We devise a Gibbs sampler for sampling from the joint posterior distribution of haplotype sequences, assignment of reads to haplotypes, and error rate of the sequencing process, to obtain estimates of the local haplotype structure of the population. The method is evaluated on simulated data and on experimental deep sequencing data obtained from HIV samples.
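To make the three sampled quantities concrete (haplotype sequences, read assignments, and error rate), the following is a minimal sketch of a Gibbs sampler for a Dirichlet process mixture over reads, not the authors' implementation. Everything specific in it is an assumption made for illustration: reads are taken to be pre-aligned strings of equal length over a single local window, errors are uniform (a base is read correctly with probability 1 - eps and as each other base with probability eps/3), new haplotypes have a uniform prior over sequences, the error rate has a flat Beta(1, 1) prior, and the names `gibbs`, `sample_haplotype`, `log_lik`, and `alpha` are hypothetical.

```python
import math
import random

BASES = "ACGT"

def log_lik(read, hap, eps):
    """Log-probability of a read given a haplotype (assumed uniform error model)."""
    mismatches = sum(r != h for r, h in zip(read, hap))
    matches = len(read) - mismatches
    return matches * math.log(1 - eps) + mismatches * math.log(eps / 3)

def sample_index(log_weights, rng):
    """Draw an index with probability proportional to exp(log_weights)."""
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    return rng.choices(range(len(weights)), weights=weights)[0]

def sample_haplotype(reads, eps, rng):
    """Sample each haplotype base from its posterior given the assigned reads."""
    hap = []
    for column in zip(*reads):
        logw = [
            sum(math.log(1 - eps) if r == b else math.log(eps / 3) for r in column)
            for b in BASES
        ]
        hap.append(BASES[sample_index(logw, rng)])
    return "".join(hap)

def gibbs(reads, alpha=1.0, eps=0.05, sweeps=50, seed=0):
    """Return (assignments, haplotypes, error rate) after `sweeps` Gibbs sweeps."""
    rng = random.Random(seed)
    length = len(reads[0])
    assign = [0] * len(reads)            # start with all reads in one cluster
    haps = {0: sample_haplotype(reads, eps, rng)}
    next_id = 1
    for _ in range(sweeps):
        # 1. Reassign each read given all other assignments.
        for i, read in enumerate(reads):
            old = assign[i]
            assign[i] = None
            counts = {}
            for a in assign:
                if a is not None:
                    counts[a] = counts.get(a, 0) + 1
            if old not in counts:        # the read was alone: drop the empty cluster
                del haps[old]
            ids = list(counts)
            logw = [math.log(counts[k]) + log_lik(read, haps[k], eps) for k in ids]
            # New cluster: alpha times the marginal likelihood of the read under
            # a uniform haplotype prior, which is (1/4)^length for this error model.
            logw.append(math.log(alpha) + length * math.log(0.25))
            choice = sample_index(logw, rng)
            if choice == len(ids):
                haps[next_id] = sample_haplotype([read], eps, rng)
                assign[i] = next_id
                next_id += 1
            else:
                assign[i] = ids[choice]
        # 2. Resample every haplotype given its current reads.
        for k in haps:
            members = [r for r, a in zip(reads, assign) if a == k]
            haps[k] = sample_haplotype(members, eps, rng)
        # 3. Resample the error rate from its Beta posterior (flat prior assumed).
        mismatches = sum(
            r != h for read, a in zip(reads, assign) for r, h in zip(read, haps[a])
        )
        total = len(reads) * length
        eps = rng.betavariate(1 + mismatches, 1 + total - mismatches)
        eps = min(max(eps, 1e-6), 0.5)   # keep the logarithms finite
    return assign, haps, eps
```

Calling `gibbs(reads)` on a list of equal-length read strings returns one posterior sample; in practice one would collect samples over many sweeps after a burn-in period and summarize them, for instance by replacing each read with its most frequently assigned haplotype as a form of error correction. The concentration parameter `alpha` controls how readily new haplotypes are opened and would need tuning for real data.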