The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags
- 30 October 2003
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 100 (23), 13418-13423
- https://doi.org/10.1073/pnas.1233632100
Abstract
Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define ≈23,500 genes, of which only ≈1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.