Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression
- 10 January 2006
- journal article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (6) , 676-684
- https://doi.org/10.1093/bioinformatics/btk032
Abstract
Motivation: Genomic sequences are highly redundant and contain many types of repetitive DNA. Fuzzy tandem repeats (FTRs) are of particular interest. They are found in regulatory regions of eukaryotic genes and are reported to interact with transcription factors. However, accurate assessment of FTR occurrences in different genome segments requires a specific algorithm for efficient FTR identification and classification.
Results: We have obtained formulas for P-values of FTR occurrence and developed an FTR identification algorithm implemented in the TandemSWAN software. Using TandemSWAN, we compared the structure and occurrence of FTRs with short period length (up to 24 bp) in coding and non-coding regions, including UTRs, heterochromatic, intergenic and enhancer sequences of Drosophila melanogaster and Drosophila pseudoobscura. Tandems with period three and its multiples were found in coding segments, whereas FTRs with periods that are multiples of six are overrepresented in all non-coding segments. Periods equal to 5–7 and 11–14 were characteristic of the enhancer regions and other non-coding regions close to genes.
Availability: TandemSWAN web page, stand-alone version and documentation can be found at
Contact: valeyo@imb.ac.ru
Supplementary information: Supplementary data are available at Bioinformatics online.
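To make the notion of a "fuzzy" repeat period concrete, the following is a minimal illustrative sketch. It is not the TandemSWAN algorithm and does not reproduce the P-value formulas mentioned in the abstract. It assumes a simple self-similarity measure: for a candidate period p, the fraction of positions i at which the base equals the base p positions downstream. The function names `period_match_fraction` and `candidate_periods`, and the 0.75 threshold, are hypothetical choices made for this example only.

```python
def period_match_fraction(seq: str, period: int) -> float:
    """Fraction of positions i where seq[i] == seq[i + period].

    A perfect tandem repeat with this period scores 1.0; point
    mutations ("fuzziness") lower the score gradually.
    """
    if period <= 0 or period >= len(seq):
        raise ValueError("period must be between 1 and len(seq) - 1")
    pairs = len(seq) - period
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + period])
    return matches / pairs


def candidate_periods(seq: str, max_period: int = 24, min_fraction: float = 0.75):
    """List (period, match fraction) for periods scoring at least min_fraction.

    max_period defaults to 24 to mirror the period range quoted in the
    abstract; min_fraction is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    hits = []
    for period in range(1, min(max_period, len(seq) - 1) + 1):
        fraction = period_match_fraction(seq, period)
        if fraction >= min_fraction:
            hits.append((period, fraction))
    return hits


if __name__ == "__main__":
    exact = "ACGACGACGACG"   # perfect period-3 repeat
    fuzzy = "ACGACGATGACG"   # same repeat with one substitution
    print(candidate_periods(exact, max_period=6))
    print(candidate_periods(fuzzy, max_period=6))
```

A raw match fraction like this says nothing about statistical significance; judging whether a repeat is unexpected for a given sequence composition is the role of the P-values the authors describe.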