Empirical evaluation of a prior for Bayesian phylogenetic inference
- 7 October 2008
- journal article
- research article
- Published by The Royal Society in Philosophical Transactions of the Royal Society B: Biological Sciences
- Vol. 363 (1512), 4031-4039
- https://doi.org/10.1098/rstb.2008.0164
Abstract
The Bayesian method of phylogenetic inference often produces high posterior probabilities (PPs) for trees or clades, even when the trees are clearly incorrect. The problem appears to be mainly due to large sizes of molecular datasets and to the large-sample properties of Bayesian model selection and its sensitivity to the prior when several of the models under comparison are nearly equally correct (or nearly equally wrong) and are of the same dimension. A previous suggestion to alleviate the problem is to let the internal branch lengths in the tree become increasingly small in the prior with the increase in the data size so that the bifurcating trees are increasingly star-like. In particular, if the internal branch lengths are assigned the exponential prior, the prior mean μ0 should approach zero faster than 1/√n but more slowly than 1/n, where n is the sequence length. This paper examines the usefulness of this data size-dependent prior using a dataset of the mitochondrial protein-coding genes from the baleen whales, with the prior mean fixed at μ0 = 0.1n^(−2/3). In this dataset, phylogeny reconstruction is sensitive to the assumed evolutionary model, species sampling and the type of data (DNA or protein sequences), but Bayesian inference using the default prior attaches high PPs to conflicting phylogenetic relationships. The data size-dependent prior alleviates the problem to some extent, giving weaker support for unstable relationships. This prior may be useful in reducing apparent conflicts in the results of Bayesian analysis or in making the method less sensitive to model violations.
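The rate condition on the prior mean can be checked numerically. The Python sketch below is illustrative only and is not taken from the paper: the function names `prior_mean` and `rate_check` and the example sequence lengths are hypothetical, while the constant 0.1 and the exponent 2/3 follow the abstract. It shows that μ0 multiplied by √n shrinks as n grows, while μ0 multiplied by n grows, which is what decaying faster than 1/√n but more slowly than 1/n means. The condition concerns rates of decay, so for short sequences μ0 need not lie numerically between 1/n and 1/√n.

```python
import math


def prior_mean(n, c=0.1, power=2.0 / 3.0):
    """Mean of the exponential prior on internal branch lengths: c * n**(-power).

    The defaults c=0.1 and power=2/3 follow the abstract; the function
    itself is a hypothetical helper, not code from the paper.
    """
    if n <= 0:
        raise ValueError("sequence length n must be positive")
    return c * n ** (-power)


def rate_check(lengths):
    """Return (n, mu0, mu0 * sqrt(n), mu0 * n) for each sequence length n."""
    rows = []
    for n in lengths:
        mu0 = prior_mean(n)
        rows.append((n, mu0, mu0 * math.sqrt(n), mu0 * n))
    return rows


if __name__ == "__main__":
    # Example sequence lengths are arbitrary and chosen only for illustration.
    for n, mu0, vs_sqrt, vs_linear in rate_check([100, 1_000, 10_000, 100_000]):
        print(f"n={n:>7}  mu0={mu0:.6f}  mu0*sqrt(n)={vs_sqrt:.4f}  mu0*n={vs_linear:.4f}")
```

As n increases, the third column falls and the fourth rises, so the prior pulls bifurcating trees towards the star tree more strongly for larger datasets without collapsing the internal branches as fast as 1/n.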