A Novel Method to Calculate the G+C Content of Genomic DNA Sequences
- 1 October 2001
- journal article
- research article
- Published by Taylor & Francis in Journal of Biomolecular Structure and Dynamics
- Vol. 19 (2) , 333-341
- https://doi.org/10.1080/07391102.2001.10506743
Abstract
The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the size of the window and the moving distance adopted. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. By this method, the G+C content can be calculated at different “resolutions”. In an extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing the fine variation of base composition along genomic sequences. As the first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. The G+C-rich regions longer than 5 kb are detected and listed in detail. It is found that each chromosome consists of several G+C-rich and G+C-poor regions that alternate, i.e., a mosaic structure. Another example is the analysis of the G+C content of each of the two chromosomes of the Vibrio cholerae genome. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
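For orientation, the conventional sliding window calculation that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the windowless method proposed in the paper; the function name `gc_content_windows`, its parameters `window` and `step`, and the toy sequence are assumptions made for this example. It shows how the reported G+C profile changes with the moving distance chosen.

```python
def gc_content_windows(seq, window, step):
    """Return (start, G+C fraction) for every full window along seq.

    window: window size in bases (hypothetical parameter name)
    step:   moving distance between consecutive window starts
            (step == window gives non-overlapping windows)
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start, gc / window))
    return results


# Toy sequence, invented for illustration only (not real genomic data).
toy = "ATGCGCGCATATATGCGC"

non_overlapping = gc_content_windows(toy, window=6, step=6)
overlapping = gc_content_windows(toy, window=6, step=3)

for start, frac in overlapping:
    print(start, round(frac, 3))
```

With the same window size, the overlapping run reports windows that the non-overlapping run never samples, which is the dependence on window size and moving distance that the abstract describes.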