New class of gene-termini-associated human RNAs suggests a novel RNA copying mechanism

Abstract
An analysis of short RNAs (fewer than 200 nucleotides) from human cells using single-molecule high-throughput sequencing has uncovered a previously unknown short RNA species. These RNAs all have the same 'tail' at their 5′ ends, consisting of a run of non-genomically encoded poly(U) residues. This, together with the finding that these RNAs are closely associated with the 3′ ends of known RNAs, points to the existence of a novel RNA-copying mechanism in human cells.

In the course of characterizing short RNAs from human cells using single-molecule high-throughput sequencing, these authors identify a new short RNA species. The presence of non-genomically encoded poly(U) residues at their 5′ ends implies the existence of an unknown RNA-copying mechanism in human cells.

Small (<200 nucleotide) RNA (sRNA) profiling of human cells using various technologies demonstrates unexpected complexity of sRNAs, with hundreds of thousands of sRNA species present [1,2,3,4]. Genetic and in vitro studies show that these RNAs are not merely degradation products of longer transcripts but could indeed have a function [1,2,5]. Furthermore, profiling of RNAs, including the sRNAs, can not only reveal novel transcripts but also make clear predictions about the existence and properties of novel biochemical pathways operating in a cell. For example, sRNA profiling in human cells indicated the existence of an unknown capping mechanism operating on cleaved RNA [2], a biochemical component of which was later identified [6]. Here we show that human cells contain a novel type of sRNA that has non-genomically encoded 5′ poly(U) tails. The presence of these RNAs at the termini of genes, specifically at the very 3′ ends of known mRNAs, strongly argues for the presence of a yet uncharacterized endogenous biochemical pathway in cells that can copy RNA.
We show that this pathway can operate on multiple genes, with specific enrichment towards transcripts encoding components of the translational machinery. Finally, we show that genes are also flanked by sense, 3′ polyadenylated sRNAs that are likely to be capped.
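
To make the phrase "non-genomically encoded 5′ poly(U) tail" concrete, the following minimal sketch shows one way such a tail could be flagged in a sequenced read. It is an illustration under stated assumptions, not the authors' analysis pipeline: the function names, the `min_tail` threshold, and the idea of comparing the tail against the genomic bases immediately upstream of the aligned read body are all hypothetical choices made here for clarity.

```python
def split_5prime_poly_u(read: str, min_tail: int = 3) -> tuple[str, str]:
    """Split a read into (tail, body), where tail is the leading run of U.

    Reads are often reported as DNA, so T is treated as U.
    `min_tail` is a hypothetical threshold, not a value from the paper.
    """
    rna = read.upper().replace("T", "U")
    run_length = len(rna) - len(rna.lstrip("U"))
    if run_length < min_tail:
        return "", rna
    return rna[:run_length], rna[run_length:]


def non_templated_u_count(tail: str, upstream_genomic: str) -> int:
    """Count tail residues that the genome cannot account for.

    `upstream_genomic` is assumed to hold the genomic bases immediately
    5' of where the read body aligns, in the same orientation as the read.
    Any run of T/U at its 3' end could explain part of the tail, so only
    the excess is counted as non-genomically encoded.
    """
    upstream = upstream_genomic.upper().replace("T", "U")
    genomic_run = len(upstream) - len(upstream.rstrip("U"))
    return max(0, len(tail) - genomic_run)


if __name__ == "__main__":
    tail, body = split_5prime_poly_u("TTTTTACGT")
    print(tail, body, non_templated_u_count(tail, "GCATT"))
```

In this toy case the read begins with five U residues while the genome offers only two matching bases upstream, so three residues would be counted as non-genomically encoded. A real analysis would also need to account for sequencing errors and alignment ambiguity, which this sketch ignores.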