FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection
Open Access
- 18 February 2013
- journal article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 30 (5) , 1196-1205
- https://doi.org/10.1093/molbev/mst030
Abstract
Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection, an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: We illustrate this on a large influenza hemagglutinin data set (3,142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution (http://www.hyphy.org), as well as on the Datamonkey web server (http://www.datamonkey.org/).
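The idea of averaging over many predefined site classes can be illustrated with a small sketch. It is not the authors' implementation and does not use HyPhy. It assumes that the likelihood of each site under each grid point (a pair of synonymous and nonsynonymous rates, here called `alpha` and `beta`) has already been computed, places a symmetric Dirichlet prior on the grid weights, and uses a simple Gibbs sampler as a stand-in for the MCMC routine mentioned in the abstract. The function name, the parameter `conc`, and the input layout are all hypothetical.

```python
import random


def posterior_positive_selection(site_lik, grid, n_iter=2000, burn_in=500,
                                 conc=0.5, seed=0):
    """Estimate P(beta > alpha | data) for every site.

    site_lik[i][k] -- precomputed likelihood of site i under grid point k
                      (assumed input; not computed here).
    grid[k]        -- (alpha_k, beta_k) rate pair for grid point k.
    conc           -- concentration of the symmetric Dirichlet prior
                      on the grid weights (illustrative choice).
    """
    rng = random.Random(seed)
    n_sites, n_grid = len(site_lik), len(grid)
    weights = [1.0 / n_grid] * n_grid
    # Grid points that correspond to positive selection (beta > alpha).
    positive = [beta > alpha for alpha, beta in grid]
    post_sum = [0.0] * n_sites
    kept = 0

    for it in range(n_iter):
        counts = [0] * n_grid
        keep = it >= burn_in
        for i, lik in enumerate(site_lik):
            # Unnormalized class probabilities for site i given the weights.
            probs = [w * l for w, l in zip(weights, lik)]
            if keep:
                total = sum(probs)
                post_sum[i] += sum(
                    p for p, pos in zip(probs, positive) if pos
                ) / total
            # Sample a class allocation for the site.
            k = rng.choices(range(n_grid), weights=probs)[0]
            counts[k] += 1
        # Draw new weights from Dirichlet(conc + counts) via gamma variates.
        draws = [rng.gammavariate(conc + c, 1.0) for c in counts]
        norm = sum(draws)
        weights = [d / norm for d in draws]
        if keep:
            kept += 1

    return [s / kept for s in post_sum]
```

Because the per-site likelihoods are computed once and reused in every iteration, the sampling loop only does cheap arithmetic on the grid, which is one plausible reading of where the speed advantage described above comes from.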