Mapping and quantifying mammalian transcriptomes by RNA-Seq

Abstract
The mouse transcriptome in three tissue types has been analyzed using Illumina next-generation sequencing technology. This quantitative RNA-Seq methodology has been used for expression analysis and splice isoform discovery and to confirm or extend reference gene models. Also in this issue, another paper reports application of the ABI SOLiD technology to sequence the transcriptome in mouse embryonic stem cells.

We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranslated regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10^5 distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices.
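
The digital measure described above amounts to counting the reads that map to each gene and then correcting for transcript length and sequencing depth. The following is a minimal sketch of one such normalization, reads per kilobase of transcript per million mapped reads. The abstract does not state the formula, so its use here is an assumption; the function name `reads_per_kb_per_million`, the gene names, and the counts are hypothetical and serve only as illustration.

```python
def reads_per_kb_per_million(read_count, transcript_length_bp, total_mapped_reads):
    """Normalize a raw read count by transcript length and sequencing depth.

    Assumed formula (not taken from the abstract):
        value = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical per-gene data: (reads mapped to the gene, transcript length in bp).
example_counts = {
    "geneA": (1000, 2000),
    "geneB": (50, 1000),
}

# Hypothetical library size, chosen to be near the 41-52 million range reported.
total_mapped = 40_000_000

normalized = {
    gene: reads_per_kb_per_million(count, length, total_mapped)
    for gene, (count, length) in example_counts.items()
}

for gene, value in normalized.items():
    print(f"{gene}: {value:.2f}")
```

Dividing by transcript length keeps long transcripts from appearing more abundant simply because they yield more 25-base-pair reads, and dividing by the total number of mapped reads makes values comparable across libraries of different depth.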