A Comprehensive Evaluation of Alignment Algorithms in the Context of RNA-Seq
Open Access
- 26 December 2012
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 7 (12), e52403
- https://doi.org/10.1371/journal.pone.0052403
Abstract
Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin, including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of alignment algorithms have been developed, including other variants of FM-index based aligners. Accordingly, developers and users of RNA-Seq mapping pipelines can choose among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and on the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus essentially making hash-based aligners obsolete.
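To illustrate what an FM-index based lookup involves, the following is a minimal Python sketch of exact matching by backward search. It is a toy under stated assumptions, not the implementation used by Bowtie or any other evaluated aligner: the function names `build_fm_index` and `find_exact` are hypothetical, the suffix array is built naively, the occurrence table is stored uncompressed, and mismatches and indels are not handled.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy, uncompressed FM-index for `text` (illustrative only)."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII order
    # Naive suffix array: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch]: number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def find_exact(pattern, index):
    """Return sorted start positions of exact matches of `pattern`."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix array rows
    # Backward search: extend the match one character at a time, right to left.
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTTACG")
    print(find_exact("ACG", index))
```

Each pattern character costs two table lookups regardless of reference length, which hints at why index-based search can be fast; the memory savings reported in the abstract depend on compressing and sampling these tables, which this sketch omits.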