Tetranucleotide frequencies in microbial genomes
- 14 April 1998
- journal article
- research article
- Published by Wiley in Electrophoresis
- Vol. 19 (4), 528-535
- https://doi.org/10.1002/elps.1150190412
Abstract
A computational strategy for determining the variability of long DNA sequences in microbial genomes is described. Composite portraits of bacterial genomes were obtained by computing tetranucleotide frequencies of sections of genomic DNA, converting the frequencies to color images and arranging the images according to their genetic position. The resulting images revealed that the tetranucleotide frequencies of genomic DNA sequences are highly conserved. Sections that were visibly different from those of the rest of the genome contained ribosomal RNA, bacteriophage, or undefined coding regions and had corresponding differences in the variances of tetranucleotide frequencies and GC content. Comparison of nine completely sequenced bacterial genomes showed that there was a nonlinear relationship between variances of the tetranucleotide frequencies and GC content, with the highest variances occurring in DNA sequences with low GC contents (less than 0.30 mol). High variances were also observed in DNA sequences having high GC contents (greater than 0.60 mol), but to a much lesser extent than DNA sequences having low GC contents. Differences in the tetranucleotide frequencies may be due to the mechanisms of intercellular genetic exchange and/or processes involved in maintaining intracellular genetic stability. Identification of sections that were different from those of the rest of the genome may provide information on the evolution and plasticity of bacterial genomes.
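The per-section computation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the section length, so the `window` parameter is an assumption, as are the function names, the use of non-overlapping sections, and the choice of population variance over the 256 tetranucleotide frequencies. The conversion of frequencies to color images is not shown.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(section):
    """Relative frequency of each tetranucleotide in one DNA section.

    Overlapping 4-mers are counted; 4-mers containing characters other
    than A, C, G, T (for example N) are ignored.
    """
    section = section.upper()
    counts = Counter(section[i:i + 4] for i in range(len(section) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def variance(values):
    """Population variance of a list of numbers (an assumed choice)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def gc_content(section):
    """Fraction of G and C among the unambiguous bases of a section."""
    section = section.upper()
    acgt = sum(section.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (section.count("G") + section.count("C")) / acgt


def profile_sections(genome, window=10000):
    """Split a genome into non-overlapping sections of `window` bases and
    return (start, GC content, variance of tetranucleotide frequencies)
    for each section. The window size is hypothetical."""
    rows = []
    for start in range(0, len(genome) - window + 1, window):
        section = genome[start:start + window]
        rows.append((start, gc_content(section), variance(tetra_freqs(section))))
    return rows
```

Under these assumptions, sections whose GC content or variance departs strongly from the rest of the rows returned by `profile_sections` would correspond to the visibly different regions the abstract mentions.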