Cluster analysis of genes in codon space
- 1 June 1984
- journal article
- research article
- Published by Springer Nature in Journal of Molecular Evolution
- Vol. 20 (2), 167-174
- https://doi.org/10.1007/bf02257377
Abstract
We construct a “codon space” in which a given DNA sequence can be plotted as a function of its base composition in each of the three codon positions. We demonstrate that the base composition is very highly nonrandom, with sequences from more primitive organisms having the least random compositions. By using cluster analysis on the points plotted in codon space we show that there is a strong correlation between base composition and type of organism, with the most primitive organisms having the highest A or T content in the second and third codon positions. A smooth transition toward lower A+T and higher G+C content is observed in the second and third codon positions as the evolutionary complexity of the organism increases. Besides this general trend, more detailed structure can be observed in the clustering that will become clearer as the data base is increased.
This publication has 7 references indexed in Scilit:
- A thermodynamic theory of codon bias in viral genes. Journal of Theoretical Biology, 1983
- On the informational content of viral DNA. Journal of Theoretical Biology, 1983
- Periodic correlations in DNA sequences and evidence suggesting their evolutionary origin in a comma-less genetic code. Journal of Molecular Evolution, 1981
- Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Research, 1981
- Working of the genetic code. Trends in Biochemical Sciences, 1980
- Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Research, 1980
- The Monte Carlo Method of Evaluating Integrals. Published by Defense Technical Information Center (DTIC), 1975
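Illustrative sketch
The abstract describes plotting each sequence by its base composition at the three codon positions and then clustering the resulting points. The abstract does not give the exact coordinates or the clustering procedure, so the Python sketch below is only one plausible reading: it assumes a 12-component point (A, C, G, T fractions at each position), Euclidean distance, and a single nearest-pair step standing in for a full cluster analysis. The function names and the toy sequences are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import dist

BASES = "ACGT"


def codon_space_point(seq):
    """Return 12 fractions: A, C, G, T frequency at codon positions 1, 2, 3.

    Assumption: the sequence is an in-frame coding sequence.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    point = []
    for pos in range(3):
        column = seq[pos:usable:3]  # every base at this codon position
        counts = Counter(column)
        total = sum(counts[b] for b in BASES)  # skip ambiguous symbols such as N
        point.extend(counts[b] / total if total else 0.0 for b in BASES)
    return point


def at_content_by_position(seq):
    """Return the A+T fraction at codon positions 1, 2 and 3."""
    p = codon_space_point(seq)
    return [p[4 * pos] + p[4 * pos + 3] for pos in range(3)]


def closest_pair(points):
    """Return the names of the two nearest points (Euclidean distance).

    This is only the first merge step of an agglomerative clustering,
    used here as a stand-in for the paper's unspecified method.
    """
    names = sorted(points)
    pairs = [
        (dist(points[a], points[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    _, a, b = min(pairs)
    return a, b


# Toy sequences invented for illustration; not data from the paper.
toy = {
    "gene_a": "ATGAAATTT",
    "gene_b": "ATGAATTTA",
    "gene_c": "GCGGCCGGC",
}
toy_points = {name: codon_space_point(s) for name, s in toy.items()}
```

With these toy inputs, the two A+T-rich sequences have identical position-wise compositions and are therefore the closest pair, while the G+C-rich sequence lies far from both.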