Mammalian Mutation Pressure, Synonymous Codon Choice, and mRNA Degradation

Abstract
The usage of synonymous codons (SCs) in mammalian genes is highly correlated with local base composition and is therefore thought to be determined by mutation pressure. The usage is nonetheless structured. For instance, mammals share with Saccharomyces and Drosophila both most preferences for the C-ending over the G-ending codon (or vice versa) within each fourfold-degenerate SC family and the placement of SCs along coding regions in ways that minimize the number of T|A and C|G dinucleotides (“|” being the codon boundary). TA and CG are underrepresented throughout the mammalian genome, which affects SC usage, the amino acid composition of proteins, and the primary structure of introns and noncoding DNA. Whereas the rarity of CG is ascribed to the high mutability of this dinucleotide, the rarity of TA in coding regions is considered adaptive because UA dinucleotides are cleaved by endoribonucleases. Here we present in vivo experimental evidence indicating that the number of T|A and/or C|G dinucleotides of a human gene can strongly affect the expression level and degradation of its mRNA. Our results are consistent with indirect evidence produced by other workers and with the detailed work that has been devoted to characterizing UA cleavage in vitro and in vivo. We conclude that SC choice can strongly influence mRNA function and gene expression through effects not directly related to the codon–anticodon interaction. These effects should heavily constrain the nucleotide motif composition of the most abundant mRNAs in the transcriptome, in particular their SC usage, a usage that must be reflected by cellular tRNA concentrations and thus defines for all other genes which SCs are translated fastest and most accurately.
Furthermore, the need to avoid such effects genome-wide appears serious enough to have favored the evolution of context-dependent mutation biases that reduce the occurrence of intrinsically unfavorable motifs and/or, where possible, to have induced the molecular machinery mediating such effects to rely opportunistically on preexisting motif rarities and abundances. This may explain why nucleotide motif preferences are very similar in transcribed and nontranscribed mammalian DNA even though the preferences appear to be adaptive only in transcribed DNA.
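
To make the T|A and C|G notation concrete, the following minimal Python sketch counts the dinucleotides that span a codon boundary (third base of one codon, first base of the next) in an in-frame coding sequence. The function name `boundary_dinucleotide_counts` and both example sequences are hypothetical illustrations, not material from the study; the two sequences are assumed to be synonymous encodings of the same short peptide (Met-Gly-Ser-Ala-Asp) under the standard genetic code.

```python
def boundary_dinucleotide_counts(cds):
    """Count T|A and C|G dinucleotides that span codon boundaries.

    `cds` is assumed to be an in-frame coding sequence (length divisible
    by 3). RNA input is accepted: U is treated as T.
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    counts = {"T|A": 0, "C|G": 0}
    # i is the index of the first base of each codon after the first one;
    # seq[i - 1] is therefore the third base of the preceding codon.
    for i in range(3, len(seq), 3):
        pair = seq[i - 1] + seq[i]
        if pair == "TA":
            counts["T|A"] += 1
        elif pair == "CG":
            counts["C|G"] += 1
    return counts


# Hypothetical example: two synonymous encodings of Met-Gly-Ser-Ala-Asp.
variant_rich = "ATGGGTAGCGCCGAT"  # ATG GGT AGC GCC GAT
variant_poor = "ATGGGCAGTGCTGAT"  # ATG GGC AGT GCT GAT

print(boundary_dinucleotide_counts(variant_rich))
print(boundary_dinucleotide_counts(variant_poor))
```

The two variants differ only in their choice of synonymous codons, yet the first contains one T|A and two C|G boundaries while the second contains none, which is the kind of contrast the abstract refers to when it describes SC placement that minimizes these dinucleotides.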