Splicing and the Evolution of Proteins in Mammals
Open Access
- 6 February 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Biology
- Vol. 5 (2) , e14
- https://doi.org/10.1371/journal.pbio.0050014
Abstract
It is often supposed that a protein's rate of evolution and its amino acid content are determined by the function and anatomy of the protein. Here we examine an alternative possibility, namely that the requirement to specify in the unprocessed RNA, in the vicinity of intron–exon boundaries, information necessary for removal of introns (e.g., exonic splice enhancers) affects both amino acid usage and rates of protein evolution. We find that the majority of amino acids show skewed usage near intron–exon boundaries, and that differences in the trends for the 2-fold and 4-fold blocks of both arginine and leucine show this to be owing to effects mediated at the nucleotide level. More specifically, there is a robust relationship between the extent to which an amino acid is preferred/avoided near boundaries and its enrichment/paucity in splice enhancers. As might then be expected, the rate of evolution is lowest near intron–exon boundaries, at least in part owing to splice enhancers, such that domains flanking intron–exon junctions evolve on average at under half the rate of exon centres from the same gene. In contrast, the rate of evolution of intronless retrogenes is highest near the domains where intron–exon junctions previously resided. The proportion of sequence near intron–exon boundaries is one of the stronger predictors of a protein's rate of evolution in mammals yet described. We conclude that, after intron insertion, selection favours modification of amino acid content near intron–exon junctions so as to enable efficient intron removal, and that these changes are then subject to strong purifying selection even if nonoptimal for protein function. Thus there exists a strong force operating on protein evolution in mammals that is not explained directly in terms of the biology of the protein.
Most of the DNA in our genes is actually not involved in the specification of proteins. Rather, the bits with the protein-coding information (exons) are separated from each other by noncoding bits, introns. Before a gene can be translated into protein, these introns are removed and the exons are spliced back together. While information about which DNA to remove is largely in the introns themselves, parts of the exons near the intron–exon boundary can, for example, function as splice enhancer elements. In principle, then, these parts of exons have two functions: to specify the amino acids of the resulting protein and to enable the correct removal of introns. What impact might this have on a gene's evolution? We show that near intron–exon boundaries, amino acid usage is biased towards nucleotides involved in splice control. Moreover, these parts of genes evolve especially slowly. Indeed, we estimate that a gene with many exons would evolve at under half the rate of the same gene with no introns, simply owing to the need to specify where to remove introns. Likewise, genes that have lost their introns evolve especially fast near the former intron's location. Thus, human proteins may not be as optimised as they could be, as their sequence is serving two conflicting roles.
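The comparison of amino acid usage near intron–exon boundaries with usage in exon centres can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 10-residue flank width, the use of a simple frequency difference as the skew measure, and the toy input are all assumptions made for illustration. Each input string is taken to be the peptide encoded by one internal exon.

```python
from collections import Counter


def flank_and_core(exon_peptide, flank=10):
    """Split one exon's peptide into boundary-flanking residues and core residues.

    `flank` (residues counted from each exon end) is an illustrative
    assumption, not a value taken from the paper.
    """
    if len(exon_peptide) <= 2 * flank:
        # Exon too short to have a distinct centre: treat it all as flank.
        return exon_peptide, ""
    flanks = exon_peptide[:flank] + exon_peptide[-flank:]
    core = exon_peptide[flank:-flank]
    return flanks, core


def usage_skew(exon_peptides, flank=10):
    """Return, per amino acid, (frequency in flanks) - (frequency in cores).

    Positive values mean the residue is more common near boundaries.
    """
    flank_counts, core_counts = Counter(), Counter()
    for peptide in exon_peptides:
        flanks, core = flank_and_core(peptide, flank)
        flank_counts.update(flanks)
        core_counts.update(core)

    n_flank = sum(flank_counts.values())
    n_core = sum(core_counts.values())
    skew = {}
    for aa in set(flank_counts) | set(core_counts):
        f_freq = flank_counts[aa] / n_flank if n_flank else 0.0
        c_freq = core_counts[aa] / n_core if n_core else 0.0
        skew[aa] = f_freq - c_freq
    return skew


if __name__ == "__main__":
    # Toy data only: lysine placed at exon ends, alanine in the centre.
    print(usage_skew(["KKAAAAKK", "KKAAAAAAKK"], flank=2))
```

A real analysis would also need to control for nucleotide composition and gene-level effects, which this sketch does not attempt.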