Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell lines
Open Access
- 29 November 2006
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (1) , 108-116
- https://doi.org/10.1101/gr.5488207
Abstract
We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of ∼5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling.
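The abstract's point about duplicate ditags can be illustrated with a small sketch. It is not the authors' pipeline. It assumes a CATG anchor (NlaIII) followed by 17 bases, giving the 21-base-pair tags mentioned above, and it takes ditags as exactly 34 bases with the flanking anchors already removed. The names `split_ditag` and `count_tags` are hypothetical, and the ditag sequences are toy data.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring-enzyme site (NlaIII); palindromic
TAG_LEN = 17      # assumed bases after the anchor, so each tag is 21 bp

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_ditag(ditag):
    """Split a ditag (flanking anchors removed) into its two anchored tags.

    The left tag is read forward from the left anchor; the right tag is
    read from the right anchor on the opposite strand.
    """
    left = ANCHOR + ditag[:TAG_LEN]
    right = ANCHOR + reverse_complement(ditag)[:TAG_LEN]
    return left, right


def count_tags(ditags, keep_duplicates=True):
    """Count tags, optionally discarding repeated ditags first.

    Simplification: only identical strings are treated as duplicates;
    a real pipeline would also match a ditag to its reverse complement.
    """
    if not keep_duplicates:
        ditags = list(dict.fromkeys(ditags))  # order-preserving de-duplication
    counts = Counter()
    for ditag in ditags:
        counts.update(split_ditag(ditag))
    return counts


# Toy library: one ditag observed three times, another observed once.
ditag_a = "A" * 17 + "C" * 17
ditag_b = "T" * 17 + "C" * 17
library = [ditag_a, ditag_a, ditag_a, ditag_b]

retained = count_tags(library, keep_duplicates=True)
discarded = count_tags(library, keep_duplicates=False)
```

In this toy library, discarding duplicate ditags lowers the count of the tag seen three times to one, which makes it look as abundant as the tag seen once. That flattening of abundant tags is the kind of bias the abstract argues against introducing.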