Trinucleotide repeats in human genome and exome
Open Access
- 9 March 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 38 (12) , 4027-4039
- https://doi.org/10.1093/nar/gkq127
Abstract
Trinucleotide repeats (TNRs) are of interest in genetics because they are used as markers for tracing genotype–phenotype relations and because they are directly involved in numerous human genetic diseases. In this study, we searched the human genome reference sequence and annotated exons (exome) for the presence of uninterrupted triplet repeat tracts composed of six or more repeated units. A list of 32 448 TNRs and 878 TNR-containing genes was generated and is provided herein. We found that some triplet repeats, specifically CNG, are overrepresented, while CTT, ATC, AAC and AAT are underrepresented in exons. This observation suggests that the occurrence of TNRs in exons is not random, but undergoes positive or negative selective pressure. Additionally, TNR types strongly determine their localization in mRNA sections (ORF, UTRs). Most genes containing exon-overrepresented TNRs are associated with gene ontology-defined functions. Surprisingly, many groups of genes that contain TNR types coding for different homo-amino acid tracts associate with the same transcription-related GO categories. We propose that TNRs have potential to be functional genetic elements and that their variation may be involved in the regulation of many common phenotypes; as such, TNR polymorphisms should be considered a priority in association studies.
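The search criterion stated in the abstract (uninterrupted tracts of six or more repeated triplet units) can be illustrated with a short Python sketch. This is not the authors' pipeline, which the abstract does not describe. The function name `find_tnr_tracts`, the regular-expression approach, the exclusion of mononucleotide runs such as AAA, and the reporting of non-overlapping tracts on a single strand are all assumptions made for illustration.

```python
import re


def find_tnr_tracts(sequence, min_units=6):
    """Yield (start, motif, units) for uninterrupted trinucleotide tracts.

    Hypothetical helper: scans one strand only, reports non-overlapping
    tracts, and ignores any stretch containing non-ACGT symbols such as N.
    """
    # ([ACGT]{3}) captures one motif; the backreference \1 must then repeat
    # at least (min_units - 1) more times with nothing in between.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        # Assumption: a run of a single base (e.g. AAA repeated) is a
        # mononucleotide repeat, not a trinucleotide repeat, so skip it.
        if len(set(motif)) == 1:
            continue
        yield match.start(), motif, len(match.group(0)) // 3


if __name__ == "__main__":
    # Toy sequence: (CAG)7 passes the threshold, (GAA)5 does not,
    # and the poly-A tail is skipped as a mononucleotide run.
    toy = "TT" + "CAG" * 7 + "TTT" + "GAA" * 5 + "A" * 20
    for start, motif, units in find_tnr_tracts(toy):
        print(start, motif, units)
```

The motif is reported in the frame in which the leftmost match begins, so CAG, AGC and GCA tracts are listed under different names here. Grouping motifs by frame or strand equivalence would be a separate step, which this sketch leaves out.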