Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes
- 9 January 2012
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 109 (4) , 1347-1352
- https://doi.org/10.1073/pnas.1118018109
Abstract
RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq.
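The counting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The barcode set, the gene labels, the one-mismatch tolerance, and the function names `hamming`, `assign_barcode`, and `digital_counts` are all assumptions made for illustration. Each read is reduced to a cDNA label plus the barcodes read from its two ends. Observed barcodes are matched to a known set while tolerating a limited number of sequencing errors, and abundance is taken as the number of distinct barcode pairs per cDNA rather than the number of reads.

```python
from collections import defaultdict

# Hypothetical toy barcode set: every pair differs at all 5 positions,
# so a single sequencing error still leaves exactly one closest barcode.
BARCODES = ["AAAAA", "CCCCC", "GGGGG", "TTTTT"]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(observed, known_barcodes, max_mismatches=1):
    """Return the single known barcode within max_mismatches of the
    observed sequence, or None if there is no match or more than one."""
    hits = [
        bc for bc in known_barcodes
        if len(bc) == len(observed) and hamming(observed, bc) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def digital_counts(reads, known_barcodes, max_mismatches=1):
    """reads: iterable of (cdna_label, left_barcode, right_barcode).

    Returns (read_counts, unique_counts): conventional per-cDNA read
    counts, and counts of distinct barcode pairs per cDNA.
    Reads with an unassignable barcode are discarded from both.
    """
    read_counts = defaultdict(int)
    molecules = defaultdict(set)
    for cdna, left, right in reads:
        left_bc = assign_barcode(left, known_barcodes, max_mismatches)
        right_bc = assign_barcode(right, known_barcodes, max_mismatches)
        if left_bc is None or right_bc is None:
            continue
        read_counts[cdna] += 1
        molecules[cdna].add((left_bc, right_bc))
    unique_counts = {cdna: len(pairs) for cdna, pairs in molecules.items()}
    return dict(read_counts), unique_counts


# Made-up reads: geneA comes from two molecules, one of them amplified
# to three reads (one read carries a sequencing error in its left barcode).
reads = [
    ("geneA", "AAAAA", "CCCCC"),
    ("geneA", "AAAAA", "CCCCC"),
    ("geneA", "AAAAT", "CCCCC"),  # one mismatch, still assigned to AAAAA
    ("geneA", "GGGGG", "TTTTT"),
    ("geneB", "CCCCC", "AAAAA"),
    ("geneB", "ACGTA", "AAAAA"),  # left barcode unassignable, read dropped
]

read_counts, unique_counts = digital_counts(reads, BARCODES)
print(read_counts)    # per-cDNA read counts
print(unique_counts)  # per-cDNA distinct barcode pairs
```

In this toy input, geneA has four usable reads but only two distinct barcode pairs, so unequal amplification of one molecule does not inflate the digital count. A real barcode set would be much larger and designed for a guaranteed minimum pairwise distance, and cDNA identity would come from read alignment rather than a label.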