Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals
Open Access
- 1 February 2009
- journal article
- research article
- Published by Springer Nature in Nature
- Vol. 458 (7235), 223-227
- https://doi.org/10.1038/nature07672
Abstract
Mammalian genomes are transcribed to produce numerous large non-coding RNAs, but their function is unclear, primarily because these transcripts show little or no evidence of evolutionary conservation. A new approach to characterizing these molecules has now moved the field forward. Rather than targeting the RNA molecules themselves, the approach infers their existence from chromatin modifications (epigenomic marks) in four mouse cell types. The search yielded over a thousand large multi-exonic transcriptional units that do not overlap known protein-coding loci and are highly conserved. Putative functions could be assigned to each of these large intervening non-coding RNAs (lincRNAs), ranging from embryonic stem cell pluripotency to cell proliferation. Specific lincRNAs turn out to be regulated by transcription factors that are key in these processes, including p53, NFκB, Sox2, Oct4 and Nanog, and most of these lincRNAs are conserved across mammals.
This study uses chromatin marks in four mouse cell types to identify ∼1,600 large multi-exonic transcriptional units that do not overlap known protein-coding loci and are highly conserved. Putative functions, ranging from embryonic stem (ES) cell pluripotency to cell proliferation, are assigned to each of these large intervening non-coding RNAs.
There is growing recognition that mammalian cells produce many thousands of large intergenic transcripts (refs 1-4). However, the functional significance of these transcripts has been particularly controversial. Although there are some well-characterized examples, most (>95%) show little evidence of evolutionary conservation and have been suggested to represent transcriptional noise (refs 5, 6). Here we report a new approach to identifying large non-coding RNAs using chromatin-state maps to discover discrete transcriptional units lying between known protein-coding loci. Our approach identified ∼1,600 large multi-exonic RNAs across four mouse cell types. In sharp contrast to previous collections, these large intervening non-coding RNAs (lincRNAs) show strong purifying selection in their genomic loci, exonic sequences and promoter regions, with greater than 95% showing clear evolutionary conservation. We also developed a functional genomics approach that assigns putative functions to each lincRNA, demonstrating a diverse range of roles for lincRNAs in processes from embryonic stem cell pluripotency to cell proliferation. We obtained independent functional validation for the predictions for over 100 lincRNAs, using cell-based assays. In particular, we demonstrate that specific lincRNAs are transcriptionally regulated by key transcription factors in these processes such as p53, NFκB, Sox2, Oct4 (also known as Pou5f1) and Nanog. Together, these results define a unique collection of functional lincRNAs that are highly conserved and implicated in diverse biological processes.
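The core filtering idea in the abstract, keeping chromatin-defined transcriptional units that do not overlap known protein-coding loci, can be illustrated with a minimal sketch. Everything here is an illustrative assumption rather than the paper's pipeline: the `Interval` type, the function names and the toy coordinates are hypothetical, and the abstract does not describe how the chromatin domains themselves are called.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval (hypothetical type for illustration)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intergenic_candidates(domains: List[Interval],
                          coding_loci: List[Interval]) -> List[Interval]:
    """Keep chromatin-defined domains that overlap no protein-coding locus.

    A simple quadratic scan; adequate for a toy example only.
    """
    return [d for d in domains
            if not any(overlaps(d, g) for g in coding_loci)]


if __name__ == "__main__":
    # Toy, made-up coordinates (not data from the study).
    domains = [
        Interval("chr1", 100, 500),
        Interval("chr1", 1000, 2000),
        Interval("chr2", 100, 500),
    ]
    coding_loci = [Interval("chr1", 400, 800)]
    for d in intergenic_candidates(domains, coding_loci):
        print(d.chrom, d.start, d.end)
```

In this toy run the first domain is discarded because it overlaps the coding locus on chr1, while the other two are retained as candidates. A genome-scale version would typically sort the intervals or use an interval tree instead of the pairwise scan.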
This publication has 31 references indexed in Scilit:
- Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature, 2008
- Distinguishing protein-coding and noncoding genes in the human genome. Proceedings of the National Academy of Sciences, 2007
- Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes. Genome Research, 2007
- Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature, 2007
- Functional Demarcation of Active and Silent Chromatin Domains in Human HOX Loci by Noncoding RNAs. Cell, 2007
- Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Research, 2007
- Dissecting self-renewal in stem cells with RNA interference. Nature, 2006
- The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nature Genetics, 2006
- Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences, 2005
- Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Research, 2005