Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome
Open Access
- 30 May 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 4 (5) , e1000083
- https://doi.org/10.1371/journal.pgen.1000083
Abstract
Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s| < 0.01%), 30–42% are moderately deleterious (0.01% < |s| < 1%), and nearly all the remainder are highly deleterious or lethal (|s| > 1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.
Although mutations are known to cause varying degrees of harmful effects, it is difficult to quantify the distribution that best describes the variation of fitness effects of these mutations. Here we present a new method for inferring both this distribution and population history using Single Nucleotide Polymorphism (SNP) data from human populations. Using 47,576 SNPs discovered in 11,404 genes from sequencing 35 individuals (20 European Americans and 15 African Americans), we find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral, 30–42% are moderately deleterious, and nearly all the remainder are highly deleterious or lethal. Furthermore, we infer that 10–20% of amino acid differences between humans and chimpanzees were fixed by positive selection, with the remainder of differences being neutral or nearly neutral.
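The three-way split of mutations described above (nearly neutral, moderately deleterious, highly deleterious) can be illustrated with a minimal sketch, assuming a gamma distribution of |s|. This is not the authors' inference method, which fits demography and selection jointly to SNP frequency data. The shape and scale values, the cut-offs, and both function names are hypothetical and chosen only to show how such fractions follow from a distribution's cumulative probabilities.

```python
import math


def regularized_lower_gamma(a, x, tol=1e-12, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x) via its power series.

    Uses gamma(a, x) = x^a * e^(-x) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
    Adequate for the small-to-moderate x used here; not tuned for large x.
    """
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def gamma_dfe_fractions(shape, scale, cutoffs=(1e-4, 1e-2)):
    """Fraction of mutations with |s| in each bin under a gamma DFE.

    Bins are [0, c1), [c1, c2), ..., [c_last, infinity).
    The default cut-offs are illustrative assumptions.
    """
    cdf = [regularized_lower_gamma(shape, c / scale) for c in cutoffs]
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


# Hypothetical parameters: a strongly leptokurtic gamma (shape < 1).
fractions = gamma_dfe_fractions(shape=0.2, scale=0.25)
labels = ["nearly neutral", "moderately deleterious", "highly deleterious"]
for label, frac in zip(labels, fractions):
    print(f"{label}: {frac:.1%}")
```

A shape parameter below 1 concentrates probability mass near zero while keeping a long tail, which is the leptokurtic, peaked-near-neutrality pattern the text describes. Changing `shape`, `scale`, or `cutoffs` shows how sensitive the three fractions are to those choices.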