A Random Effects Branch-Site Model for Detecting Episodic Diversifying Selection
- 13 June 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 28 (11) , 3033-3043
- https://doi.org/10.1093/molbev/msr125
Abstract
Adaptive evolution frequently occurs in episodic bursts, localized to a few sites in a gene, and to a small number of lineages in a phylogenetic tree. A popular class of "branch-site" evolutionary models provides a statistical framework to search for evidence of such episodic selection. For computational tractability, current branch-site models unrealistically assume that all branches in the tree can be partitioned a priori into two rigid classes: "foreground" branches that are allowed to undergo diversifying selective bursts and "background" branches that are negatively selected or neutral. We demonstrate that this assumption leads to unacceptably high rates of false positives or false negatives when the evolutionary process along background branches strongly deviates from modeling assumptions. To address this problem, we extend Felsenstein's pruning algorithm to allow efficient likelihood computations for models in which variation over branches (and not just sites) is described in the random effects likelihood framework. This enables us to model the process at every branch-site combination as a mixture of three Markov substitution models: our model treats the selective class of every branch at a particular site as an unobserved state that is chosen independently of that at any other branch. When benchmarked on a previously published set of simulated sequences, our method consistently matched or outperformed existing branch-site tests in terms of power and error rates. Using three empirical data sets, previously analyzed for episodic selection, we discuss how modeling assumptions can influence inference in practical situations.
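The abstract's central computational point, that a branch class drawn independently at every branch can be handled inside the pruning recursion, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy symmetric two-state Markov chain in place of a codon model, and the three rates, the mixture weights, the tree encoding, and all function names are hypothetical choices made for illustration. Under that assumption, replacing each branch's transition matrix with the weighted average of the three class-specific matrices gives the same site likelihood as summing over every assignment of classes to branches.

```python
import math
from itertools import product

# Hypothetical toy settings: three rate classes standing in for three
# selective regimes, with made-up mixture weights that sum to 1.
RATES = (0.1, 1.0, 5.0)
WEIGHTS = (0.6, 0.3, 0.1)
N_STATES = 2

def transition_matrix(rate, t):
    """Closed-form P(t) for a symmetric two-state chain (toy stand-in for a codon model)."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]

def mixed_matrix(t):
    """Weighted average of the class-specific transition matrices for one branch."""
    mats = [transition_matrix(r, t) for r in RATES]
    return [[sum(w * m[i][j] for w, m in zip(WEIGHTS, mats))
             for j in range(N_STATES)] for i in range(N_STATES)]

def partial(node, branch_matrix):
    """Pruning recursion: conditional likelihood of the data below `node` for each state.

    A leaf is an int (its observed state); an internal node is a list of
    (child, branch_id, branch_length) tuples.
    """
    if isinstance(node, int):
        return [1.0 if s == node else 0.0 for s in range(N_STATES)]
    out = [1.0] * N_STATES
    for child, branch_id, t in node:
        child_partial = partial(child, branch_matrix)
        P = branch_matrix(branch_id, t)
        for s in range(N_STATES):
            out[s] *= sum(P[s][x] * child_partial[x] for x in range(N_STATES))
    return out

def site_likelihood(tree, branch_matrix, root_freqs=(0.5, 0.5)):
    """Likelihood of one site, averaging the root partials over root frequencies."""
    return sum(f * p for f, p in zip(root_freqs, partial(tree, branch_matrix)))

def mixture_likelihood(tree):
    """One pruning pass with the mixed matrix on every branch."""
    return site_likelihood(tree, lambda branch_id, t: mixed_matrix(t))

def brute_force_likelihood(tree, n_branches):
    """Explicit sum over all class assignments to branches (exponential cost)."""
    total = 0.0
    for assign in product(range(len(RATES)), repeat=n_branches):
        weight = math.prod(WEIGHTS[k] for k in assign)
        total += weight * site_likelihood(
            tree, lambda branch_id, t: transition_matrix(RATES[assign[branch_id]], t))
    return total

# A four-branch example tree: ((leaf 0, leaf 1), leaf 1).
inner = [(0, 0, 0.1), (1, 1, 0.3)]
tree = [(inner, 2, 0.2), (1, 3, 0.5)]

fast = mixture_likelihood(tree)
slow = brute_force_likelihood(tree, n_branches=4)
print(fast, slow)
```

The single pruning pass costs one traversal of the tree, whereas the explicit sum grows as 3 to the power of the number of branches, which is why folding the mixture into the per-branch matrices matters for tractability in this sketch.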
This publication has 39 references indexed in Scilit:
- CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences. PLoS Computational Biology, 2010
- Correcting the Bias of Empirical Frequency Parameter Estimators in Codon Models. PLOS ONE, 2010
- Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles. Proceedings of the National Academy of Sciences, 2010
- Evolutionary Fingerprinting of Genes. Molecular Biology and Evolution, 2009
- Reliabilities of identifying positive selection by the branch-site and the site-prediction methods. Proceedings of the National Academy of Sciences, 2009
- Frequent Toggling between Alternative Amino Acids Is Driven by Selection in HIV-1. PLoS Pathogens, 2008
- Models of coding sequence evolution. Briefings in Bioinformatics, 2008
- A Maximum Likelihood Method for Detecting Directional Evolution in Protein Sequences and Its Application to Influenza A Virus. Molecular Biology and Evolution, 2008
- Asymptotic Properties of Maximum Likelihood Estimators and Likelihood Ratio Tests under Nonstandard Conditions. Journal of the American Statistical Association, 1987
- Evolutionary trees from DNA sequences: A maximum likelihood approach. Journal of Molecular Evolution, 1981