In vivo enhancer analysis of human conserved non-coding sequences
- 5 November 2006
- journal article
- research article
- Published by Springer Nature in Nature
- Vol. 444 (7118) , 499-502
- https://doi.org/10.1038/nature05295
Abstract
Identifying the non-coding DNA sequences that act at a distance to regulate patterns of gene expression is not a simple matter; one useful pointer is evolutionary sequence conservation. An in vivo analysis of 167 non-coding elements in the human genome that are extremely conserved based on comparisons with pufferfish, rat and mouse genomes, has identified 75 previously unknown tissue-specific enhancers. These are active in embryos on day 11, most of them directing expression in the developing nervous system. The success of this method suggests that the further 5,500 non-coding sequences conserved between humans and pufferfish may yield another new batch of gene enhancers.
Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remains a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we made use of extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are conserved in human–pufferfish, Takifugu (Fugu) rubripes, or ultraconserved¹ in human–mouse–rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While directing expression in a broad range of anatomical structures in the embryo, the majority of the 75 enhancers directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all ∼3,100 non-coding elements in the human genome that are conserved between human and Fugu. The testing of the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.
This publication has 28 references indexed in Scilit:
- Close sequence comparisons are sufficient to identify human cis-regulatory elements. Genome Research, 2006
- Control of Developmental Regulators by Polycomb in Human Embryonic Stem Cells. Cell, 2006
- Mapping cis-regulatory domains in the human genome using multi-species conservation of synteny. Human Molecular Genetics, 2005
- Evolution at Two Levels: On Genes and Form. PLoS Biology, 2005
- A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature, 2005
- Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development. PLoS Biology, 2004
- Comparative genomics at the vertebrate extremes. Nature Reviews Genetics, 2004
- Transcription regulation and animal diversity. Nature, 2003
- Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. Journal of Molecular Biology, 1998
- Mutations in the SALL1 putative transcription factor gene cause Townes-Brocks syndrome. Nature Genetics, 1998