Avoidance of palindromic words in bacterial and archaeal genomes: a close connection with restriction enzymes
Open Access
- 1 June 1997
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 25 (12), 2430-2439
- https://doi.org/10.1093/nar/25.12.2430
Abstract
Short palindromic sequences (4, 5 and 6 bp palindromes) are avoided at a statistically significant level in the genomes of several bacteria, including the completely sequenced Haemophilus influenzae and Synechocystis sp. genomes and in the complete genome of the archaeon Methanococcus jannaschii. In contrast, there is only moderate avoidance of palindromes in the small genome of the bacterium Mycoplasma genitalium and no detectable avoidance in the genomes of chloroplasts and mitochondria. The sites for type II restriction-modification enzymes detected in the given species tend to be among the most avoided palindromes in a particular genome, indicating a direct connection between the avoidance of short oligonucleotide words and restriction-modification systems with the respective specificity. Palindromes corresponding to sites for restriction enzymes from other species are also avoided, albeit less significantly, suggesting that in the course of evolution bacterial DNA has been exposed to a wide spectrum of restriction enzymes, probably as the result of lateral transfer mediated by mobile genetic elements, such as plasmids and prophages. Palindromic words appear to accumulate in DNA once it becomes isolated from restriction-modification systems, as demonstrated by the case of organellar genomes. By combining these observations with protein sequence analysis, we show that the most avoided 4-palindrome and the most avoided 6-palindrome in the archaeon M. jannaschii are likely to be recognition sites for two novel restriction-modification systems.
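To make the notion of an "avoided" palindromic word concrete, the following is a minimal sketch of one common way to score it: compare the observed count of a word with a count expected from its shorter sub-words. The abstract does not specify the statistic used in the article, so the Markov-style estimator here, and the function names `palindromes`, `count` and `observed_expected_ratio`, are assumptions made for illustration only. The sketch also treats a palindrome as a word strictly equal to its own reverse complement, which covers the 4 bp and 6 bp cases but not 5 bp sites.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the reverse complement of a DNA word (uppercase ACGT only)."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def palindromes(k):
    """All DNA words of length k that equal their own reverse complement."""
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return [w for w in words if w == reverse_complement(w)]


def count(seq, word):
    """Count occurrences of word in seq, including overlapping ones."""
    n = 0
    start = seq.find(word)
    while start != -1:
        n += 1
        start = seq.find(word, start + 1)
    return n


def observed_expected_ratio(seq, word):
    """Observed count divided by an expected count built from sub-words.

    Assumed estimator (not taken from the article):
        expected = N(prefix) * N(suffix) / N(core)
    where prefix and suffix drop the last and first base of the word and
    core drops both. Returns None when the estimate is undefined.
    """
    observed = count(seq, word)
    prefix = count(seq, word[:-1])
    suffix = count(seq, word[1:])
    core = count(seq, word[1:-1])
    if prefix == 0 or suffix == 0 or core == 0:
        return None
    expected = prefix * suffix / core
    return observed / expected
```

Under this sketch, a ratio well below 1 for a word such as `GAATTC` in a given genome would mark it as a candidate avoided palindrome; judging whether the deviation is statistically significant needs a variance estimate that is not shown here.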