Distribution and Characterization of Regulatory Elements in the Human Genome
Open Access
- 1 December 2002
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 12 (12) , 1827-1836
- https://doi.org/10.1101/gr.606402
Abstract
The regulation of transcription and subsequent gene splicing are crucial to correct gene expression. Although a number of regulatory sequences involved in both processes are known, it is not clear how general their functions are in the genomic context, nor how the regulatory regions are distributed throughout the genome. Here we study the distribution of known mutagenic elements within human introns and exons to deduce the properties of regions essential for splicing and transcription. We show that intronic splicing regulators are generally found close to the splice sites, but may be found as far as 200 nucleotides away from the splice junctions. Similarly, sequences important for splicing may be located as far as 125 nucleotides away from the junctions, within exons. We characterize several types of simple repetitive sequences and low-complexity regions that are overrepresented close to both intron ends and are likely to play important roles in the splicing process. We show that the first introns within most genes play a particularly important regulatory role that is most likely, however, to be involved in transcription control. We also study the distribution of two known regulatory motifs, the GGG trinucleotide and the CpG dinucleotide, and deduce their respective importance to splicing and transcription regulation.
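The abstract describes motif distributions as a function of distance from the splice junctions. The minimal sketch below shows one way such a positional profile could be tabulated for the GGG and CpG motifs. It is not the authors' pipeline: the function names (`motif_offsets`, `positional_profile`), the 200-nucleotide default window, and the toy intron sequences are all assumptions made for illustration.

```python
from collections import Counter


def motif_offsets(seq, motif):
    """Return 0-based start offsets of every occurrence of motif in seq.

    Overlapping occurrences are counted, so "GGGG" contains "GGG" twice.
    """
    seq, motif = seq.upper(), motif.upper()
    offsets = []
    start = seq.find(motif)
    while start != -1:
        offsets.append(start)
        start = seq.find(motif, start + 1)  # step by 1 to allow overlaps
    return offsets


def positional_profile(introns, motif, window=200):
    """Count motif starts at each distance from the 5' end of each intron.

    introns: iterable of intron sequences, each written 5' to 3'
             (hypothetical input; no particular file format is assumed).
    window:  number of positions to profile (assumed default of 200).
    Returns a list of length `window`; entry i is the number of introns
    in which the motif starts i nucleotides from the 5' splice junction.
    """
    counts = Counter()
    # Extend the slice so a motif starting at position window-1 still fits.
    span = window + len(motif) - 1
    for seq in introns:
        for off in motif_offsets(seq[:span], motif):
            counts[off] += 1
    return [counts[i] for i in range(window)]


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    toy_introns = ["GTAAGGGGCG", "GTGGGACGTT"]
    print(positional_profile(toy_introns, "GGG", window=10))
    print(positional_profile(toy_introns, "CG", window=10))
```

Profiling from the 3' end would work the same way on reversed coordinates. A real analysis would also need a background expectation (for example, from base composition) before calling any position overrepresented.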