Using the transcriptome to annotate the genome
- 1 May 2002
- journal article
- Published by Springer Nature in Nature Biotechnology
- Vol. 20 (5), 508-512
- https://doi.org/10.1038/nbt0502-508
Abstract
A remaining challenge for the human genome project involves the identification and annotation of expressed genes. The public and private sequencing efforts have identified ~15,000 sequences that meet stringent criteria for genes, such as correspondence with known genes from humans or other species, and have made another ~10,000–20,000 gene predictions of lower confidence, supported by various types of in silico evidence, including homology studies, domain searches, and ab initio gene predictions [1,2]. These computational methods have limitations, both because they are unable to identify a significant fraction of genes and exons and because they are unable to provide definitive evidence about whether a hypothetical gene is actually expressed [3,4]. As the in silico approaches identified a smaller number of genes than anticipated [5,6,7,8,9], we wondered whether high-throughput experimental analyses could be used to provide evidence for the expression of hypothetical genes and to reveal previously undiscovered genes. We describe here the development of such a method, called long serial analysis of gene expression (LongSAGE), an adaptation of the original SAGE approach [10], that can be used to rapidly identify novel genes and exons.
This publication has 17 references indexed in Scilit:
- The Sequence of the Human Genome. Science, 2001
- Initial sequencing and analysis of the human genome. Nature, 2001
- Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags. Proceedings of the National Academy of Sciences, 2000
- An Assessment of Gene Prediction Accuracy in Large DNA Sequences. Genome Research, 2000
- Gene Index analysis of the human genome estimates approximately 120,000 genes. Nature Genetics, 2000
- The DNA sequence of human chromosome 22. Nature, 1999
- Analysis of human transcriptomes. Nature Genetics, 1999
- Late-Night Thoughts on the Sequence Annotation Problem. Genome Research, 1998
- Serial Analysis of Gene Expression. Science, 1995
- How many genes in the human genome? Nature Genetics, 1994