RazerS 3: Faster, fully sensitive read mapping

Open Access

24 August 2012

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 28 (20), 2592-2599
https://doi.org/10.1093/bioinformatics/bts505

Abstract

Motivation: In recent years, next-generation sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase, and new protocols provide longer reads than currently available. In almost all applications, read mapping is the first step. Hence, it is crucial to have algorithms and implementations that are fast and highly sensitive, and that can handle long reads and a large absolute number of insertions and deletions.

Results: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work, we propose its successor, RazerS 3, which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of Myers’ bit-vector algorithm for verification, memory-saving measures and the SAM output format. This leads to much improved performance for mapping reads, in particular long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a pre-computed index.

Availability and Implementation: Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X and Windows.

Contact: david.weese@fu-berlin.de

Supplementary information: Supplementary data are available at Bioinformatics online.
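The q-gram counting idea named in the Results can be illustrated with a minimal sketch. It assumes the classical q-gram lemma: a read of length m that matches with at most k errors shares at least m + 1 - q(k + 1) q-grams with its match. The function names (`qgram_threshold`, `shared_qgrams`, `passes_filter`) are hypothetical and chosen for illustration; this is not the RazerS 3 implementation, which is written in C++ and uses more elaborate filtering than a per-window count.

```python
from collections import Counter


def qgram_threshold(read_len, q, max_errors):
    """Lower bound on shared q-grams given by the q-gram lemma."""
    return read_len + 1 - q * (max_errors + 1)


def shared_qgrams(read, window, q):
    """Count q-grams common to read and window, respecting multiplicity."""
    read_counts = Counter(read[i:i + q] for i in range(len(read) - q + 1))
    window_counts = Counter(window[i:i + q] for i in range(len(window) - q + 1))
    return sum(min(count, window_counts[gram]) for gram, count in read_counts.items())


def passes_filter(read, window, q, max_errors):
    """Return True if the window may contain a match and needs verification."""
    threshold = qgram_threshold(len(read), q, max_errors)
    if threshold <= 0:
        # The lemma gives no filtering power; every window must be verified.
        return True
    return shared_qgrams(read, window, q) >= threshold


read = "ACGTACGTAC"
print(passes_filter(read, "ACGTTCGTAC", q=3, max_errors=1))  # one mismatch
print(passes_filter(read, "TTTTTTTTTT", q=3, max_errors=1))  # unrelated window
```

A window that passes is only a candidate: a verification step, such as the banded bit-vector algorithm the abstract mentions, still has to confirm the match. Larger q or more allowed errors lower the threshold, which is why such filters lose specificity on reads with many errors.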

Keywords

BIOLOGICAL SCIENCES
