Identification of higher-order functional domains in the human ENCODE regions
- 13 June 2007
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (6) , 917-927
- https://doi.org/10.1101/gr.6081407
Abstract
It has long been posited that human and other large genomes are organized into higher-order (i.e., greater than gene-sized) functional domains. We hypothesized that diverse experimental data types generated by The ENCODE Project Consortium could be combined to delineate active and quiescent or repressed functional domains and thereby illuminate the higher-order functional architecture of the genome. To address this, we coupled wavelet analysis with hidden Markov models for unbiased discovery of “domain-level” behavior in high-resolution functional genomic data, including activating and repressive histone modifications, RNA output, and DNA replication timing. We find that higher-order patterns in these data types are largely concordant and may be analyzed collectively in the context of HeLa cells to delineate 53 active and 62 repressed functional domains within the ENCODE regions. Active domains comprise ∼44% of the ENCODE regions but contain ∼75%–80% of annotated genes, transcripts, and CpG islands. Repressed domains are enriched in certain classes of repetitive elements and, surprisingly, in evolutionarily conserved nonexonic sequences. The functional domain structure of the ENCODE regions appears to be largely stable across different cell types. Taken together, our results suggest that higher-order functional domains represent a fundamental organizing principle of human genome architecture.
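The general idea of pairing a coarse-scale wavelet view with a hidden Markov model can be illustrated with a toy example. The sketch below is not the authors' pipeline: it assumes a synthetic one-dimensional signal, uses block averaging (the Haar approximation at a chosen scale) as a stand-in for wavelet analysis, and decodes a two-state Gaussian HMM with fixed, hand-picked parameters by the Viterbi algorithm. All function names and parameter values are hypothetical.

```python
import math


def haar_smooth(signal, levels):
    """Haar approximation at scale 2**levels: replace each block by its mean."""
    block = 2 ** levels
    out = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        out.extend([sum(chunk) / len(chunk)] * len(chunk))
    return out


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(obs, means, sd, stay_prob):
    """Most likely state path for a 2-state HMM with Gaussian emissions."""
    states = range(2)
    log_stay, log_switch = math.log(stay_prob), math.log(1 - stay_prob)
    # Uniform prior over the initial state (an assumption of this sketch).
    score = [math.log(0.5) + log_gauss(obs[0], means[s], sd) for s in states]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for s in states:
            cands = [score[p] + (log_stay if p == s else log_switch) for p in states]
            best = max(states, key=lambda p: cands[p])
            new_score.append(cands[best] + log_gauss(x, means[s], sd))
            pointers.append(best)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def to_domains(path):
    """Collapse a state path into (start, end, state) runs, end exclusive."""
    domains, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            domains.append((start, i, path[start]))
            start = i
    return domains


# Synthetic signal: low, high, low plateaus with deterministic +/-0.4 jitter.
levels_true = [0.0] * 16 + [1.0] * 16 + [0.0] * 16
signal = [v + 0.4 * (-1) ** i for i, v in enumerate(levels_true)]

smoothed = haar_smooth(signal, levels=2)
path = viterbi_two_state(smoothed, means=(0.0, 1.0), sd=0.3, stay_prob=0.95)
domains = to_domains(path)
print(domains)
```

Here state 1 plays the role of an "active" domain and state 0 a "repressed" one. In a real analysis the emission and transition parameters would be estimated from the data rather than fixed by hand, and several data types would be combined rather than a single track.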