What is a gene, post-ENCODE? History and updated definition
Open Access
- 13 June 2007
- journal article
- review article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (6) , 669-681
- https://doi.org/10.1101/gr.6339607
Abstract
While sequencing of the human genome surprised us with how many protein-coding genes there are, it did not fundamentally change our perspective on what a gene is. In contrast, the complex patterns of dispersed regulation and pervasive transcription uncovered by the ENCODE project, together with non-genic conservation and the abundance of noncoding RNA genes, have challenged the notion of the gene. To illustrate this, we review the evolution of operational definitions of a gene over the past century, from the abstract elements of heredity of Mendel and Morgan to the present-day ORFs enumerated in the sequence databanks. We then summarize the current ENCODE findings and provide a computational metaphor for the complexity. Finally, we propose a tentative update to the definition of a gene: A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products. Our definition sidesteps the complexities of regulation and transcription by removing the former altogether from the definition and arguing that final, functional gene products (rather than intermediate transcripts) should be used to group together entities associated with a single gene. It also manifests how integral the concept of biological function is in defining genes.
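The "union of genomic sequences" in the proposed definition can be illustrated with a minimal sketch. This is not the authors' method: the product names, coordinates, and the functions `union_of_intervals` and `gene_footprint` are hypothetical. The sketch assumes the products have already been judged to form one coherent set, which is the biological decision the definition rests on and which code cannot make.

```python
from typing import Dict, List, Tuple

# Half-open interval [start, end) on a single chromosome (assumed convention).
Interval = Tuple[int, int]


def union_of_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a minimal sorted list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def gene_footprint(products: Dict[str, List[Interval]]) -> List[Interval]:
    """Union of the sequences encoding each functional product.

    `products` maps a hypothetical product name to the genomic segments
    that encode it. Regulatory regions are deliberately left out,
    mirroring the definition in the abstract.
    """
    all_segments = [seg for segs in products.values() for seg in segs]
    return union_of_intervals(all_segments)


# Hypothetical example: two overlapping products of one locus.
products = {
    "isoform_A": [(100, 200), (300, 400)],
    "isoform_B": [(150, 250), (300, 350)],
}
print(gene_footprint(products))  # [(100, 250), (300, 400)]
```

The result is a set of segments, not one contiguous span, which matches the idea that a gene under this definition need not be a single uninterrupted stretch of sequence.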