Finding cis-regulatory elements using comparative genomics: Some lessons from ENCODE data
- 13 June 2007
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (6) , 775-786
- https://doi.org/10.1101/gr.5592107
Abstract
Identification of functional genomic regions using interspecies comparison will be most effective when the full span of relationships between genomic function and evolutionary constraint is utilized. We find that sets of putative transcriptional regulatory sequences, defined by ENCODE experimental data, have a wide span of evolutionary histories, ranging from stringent constraint shown by deep phylogenetic comparisons to recent selection on lineage-specific elements. This diversity of evolutionary histories can be captured, at least in part, by the suite of available comparative genomics tools, especially after correction for regional differences in the neutral substitution rate. Putative transcriptional regulatory regions show alignability in different clades, and the genes associated with them are enriched for distinct functions. Some of the putative regulatory regions show evidence for recent selection, including a primate-specific, distal promoter that may play a novel role in regulation.