A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes
Open Access
- 22 March 2001
- journal article
- research article
- Published by Springer Nature in Genome Biology
Abstract
Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition. Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure. Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.
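The "response" of a codon described in the abstract, meaning the slope and intercept of a regression line on genome GC content, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the function names (`gc_content`, `codon_frequencies`, `codon_response`) are hypothetical, codon usage is measured as the raw fraction of all codons rather than relative to synonyms, the fit is ordinary least squares, and the three short sequences are made-up toy data, not the paper's dataset.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def codon_frequencies(seq):
    """Map each in-frame codon to its fraction of all codons in seq.

    Assumes seq is a coding sequence starting in frame; any trailing
    bases that do not fill a complete codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def linear_fit(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def codon_response(genomes, codon):
    """Regress one codon's frequency on GC content across sequences."""
    xs = [gc_content(g) for g in genomes]
    ys = [codon_frequencies(g).get(codon, 0.0) for g in genomes]
    return linear_fit(xs, ys)


# Toy coding sequences (made up) with increasing GC content.
toy_genomes = ["GCTGCAGCTAAA", "GCCGCAGCGAAG", "GCCGCCGCGGCC"]
slope, intercept = codon_response(toy_genomes, "GCC")
print(f"GCC response: slope={slope:.3f}, intercept={intercept:.3f}")
```

In this toy data the GC-rich codon GCC becomes more frequent as GC content rises, so its fitted slope is positive. Repeating the fit for every codon would give one slope and intercept per codon, which is the kind of quantity the abstract says the model predicts.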