Estimation of prokaryote genomic DNA G+C content by sequencing universally conserved genes
- 1 May 2006
- journal article
- research article
- Published by Microbiology Society in International Journal of Systematic and Evolutionary Microbiology
- Vol. 56 (5), 1025-1029
- https://doi.org/10.1099/ijs.0.63903-0
Abstract
Determination of the DNA G+C content of prokaryotic genomes using traditional methods is time-consuming and results may vary from laboratory to laboratory, depending on the technique used. We explored the possibility of extrapolating the genomic DNA G+C content of prokaryotes from gene sequences. For this, 127 universally conserved genes were studied from 50 prokaryotic genomes in the Clusters of Orthologous Groups database. Of these, 57 genes were present as a single copy in the genomes of 157 different prokaryote species available in GenBank. There was a strong correlation [coefficient of determination (r²) >95 %] between the DNA G+C contents of 20 genes and their corresponding genomes. For each of the 157 prokaryotic genomes studied, the DNA G+C content of the 20 genes was used to determine a ‘calculated’ genome DNA G+C content (CGC) and this value was compared with the ‘real’ genome DNA G+C content (RGC). In order to select the most suitable gene for the determination of CGC values, we compared the r² and median mol% difference between CGC and RGC as well as the sensitivity of each gene to provide CGC values for prokaryotic genomes that differ by less than 5 mol% from their RGC. The highly conserved ftsY gene (median size 1144 nucleotides), a vertically inherited member of the GTPase superfamily, showed the highest r² value of 0.98, the smallest median mol% difference between CGC and RGC of 1.06 and a sensitivity of 100 %. Using ftsY DNA G+C content values, the CGC values of 100 genomes not included in the calculation of r² differed by less than 5 mol% from their RGC values. These data suggest that the genomic DNA G+C content of prokaryotes may be estimated easily and reliably from the ftsY gene sequence.
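The evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the form of the regression, so an ordinary least-squares line from gene G+C content to genome G+C content is assumed here, and the function names (`gc_content`, `fit_line`, `evaluate`) and the example values are hypothetical. The sketch computes the G+C content of a sequence in mol%, derives CGC values from the fitted line, and reports r², the median absolute mol% difference between CGC and RGC, and the sensitivity (the percentage of genomes whose CGC differs from the RGC by less than 5 mol%).

```python
from statistics import mean, median


def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence in mol%.

    Only A, C, G and T are counted; ambiguous symbols such as N are ignored.
    """
    s = seq.upper()
    acgt = sum(s.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (s.count("G") + s.count("C")) / acgt


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept (assumed model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def evaluate(gene_gc, genome_gc, threshold=5.0):
    """Compare calculated (CGC) and real (RGC) genome G+C contents for one gene."""
    slope, intercept = fit_line(gene_gc, genome_gc)
    cgc = [slope * g + intercept for g in gene_gc]
    diffs = [abs(c - r) for c, r in zip(cgc, genome_gc)]
    mean_rgc = mean(genome_gc)
    ss_res = sum((r - c) ** 2 for c, r in zip(cgc, genome_gc))
    ss_tot = sum((r - mean_rgc) ** 2 for r in genome_gc)
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - ss_res / ss_tot,
        "median_diff": median(diffs),
        # Percentage of genomes whose CGC is within `threshold` mol% of the RGC.
        "sensitivity": 100.0 * sum(d < threshold for d in diffs) / len(diffs),
    }


if __name__ == "__main__":
    # Made-up illustrative values, NOT data from the study.
    gene_gc = [30.0, 40.0, 50.0, 60.0, 70.0]
    genome_gc = [31.0, 39.0, 50.0, 61.0, 69.0]
    print(evaluate(gene_gc, genome_gc))
```

In this sketch the same genomes are used both to fit the line and to evaluate it. To mirror the abstract's validation on 100 genomes that were not included in the calculation of r², the fitted `slope` and `intercept` would instead be applied to gene G+C values from a separate set of genomes.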