Analysis of High-Throughput Ancient DNA Sequencing Data
- 8 December 2011
- book chapter
- Published by Springer Nature
- Vol. 840, 197-228
- https://doi.org/10.1007/978-1-61779-516-9_23
Abstract
Advances in sequencing technologies have dramatically changed the field of ancient DNA (aDNA). It is now possible to generate an enormous quantity of aDNA sequence data both rapidly and inexpensively. As aDNA sequences are generally short in length, damaged, and at low copy number relative to coextracted environmental DNA, high-throughput approaches offer a tremendous advantage over traditional sequencing approaches in that they enable a complete characterization of an aDNA extract. However, the particular qualities of aDNA also present specific limitations that require careful consideration in data analysis. For example, results of high-throughput analyses of aDNA libraries may include chimeric sequences, sequencing error and artifacts, damage, and alignment ambiguities due to the short read lengths. Here, I describe typical primary data analysis workflows for high-throughput aDNA sequencing experiments, including (1) separation of individual samples in multiplex experiments; (2) removal of protocol-specific library artifacts; (3) trimming adapter sequences and merging paired-end sequencing data; (4) base quality score filtering or quality score propagation during data analysis; (5) identification of endogenous molecules from an environmental background; (6) quantification of contamination from other DNA sources; and (7) removal of clonal amplification products or the compilation of a consensus from clonal amplification products, and their exploitation for estimation of library complexity.
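As an illustration of step (7), the following is a minimal Python sketch of collapsing clonal amplification products into a consensus sequence. It is not the chapter's implementation. It assumes that reads have already been aligned, that reads sharing chromosome, start, end, and strand are copies of one original molecule, and that all reads in such a group have equal length (no indels). The function name `collapse_duplicates` and the tuple layout are hypothetical, and the consensus here is a simple per-position majority vote that ignores base quality scores.

```python
from collections import Counter, defaultdict


def collapse_duplicates(reads):
    """Collapse reads with identical alignment coordinates into one consensus.

    `reads` is an iterable of (chrom, start, end, strand, sequence) tuples.
    This layout is an assumption made for illustration only.
    Returns a dict mapping (chrom, start, end, strand) to a tuple of
    (consensus_sequence, number_of_reads_collapsed).
    """
    # Group sequences by alignment coordinates and strand.
    clusters = defaultdict(list)
    for chrom, start, end, strand, seq in reads:
        clusters[(chrom, start, end, strand)].append(seq)

    collapsed = {}
    for key, seqs in clusters.items():
        # Majority vote at each position; assumes equal-length sequences.
        consensus = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
        collapsed[key] = (consensus, len(seqs))
    return collapsed


if __name__ == "__main__":
    example = [
        ("chrM", 100, 108, "+", "ACGTACGT"),
        ("chrM", 100, 108, "+", "ACGTACGT"),
        ("chrM", 100, 108, "+", "ACTTACGT"),  # one copy carries an error
        ("chrM", 200, 204, "-", "GGCC"),
    ]
    for key, (seq, n) in collapse_duplicates(example).items():
        print(key, seq, n)
```

The number of reads collapsed per group gives a duplication profile, which is the kind of information that can feed an estimate of library complexity. A quality-aware consensus would weight each base by its quality score rather than counting votes equally.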