Effects of Branch Length Uncertainty on Bayesian Posterior Probabilities for Phylogenetic Hypotheses
Open Access
- 17 July 2007
- journal article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 24 (9) , 2108-2118
- https://doi.org/10.1093/molbev/msm141
Abstract
In Bayesian phylogenetics, confidence in evolutionary relationships is expressed as posterior probability: the probability that a tree or clade is true given the data, evolutionary model, and prior assumptions about model parameters. Model parameters, such as branch lengths, are never known in advance; Bayesian methods incorporate this uncertainty by integrating over a range of plausible values given an assumed prior probability distribution for each parameter. Little is known about the effects of integrating over branch length uncertainty on posterior probabilities when different priors are assumed. Here, we show that integrating over uncertainty using a wide range of typical prior assumptions strongly affects posterior probabilities, causing them to deviate from those that would be inferred if branch lengths were known in advance; only when there is no uncertainty to integrate over does the average posterior probability of a group of trees accurately predict the proportion of correct trees in the group. The pattern of branch lengths on the true tree determines whether integrating over uncertainty pushes posterior probabilities upward or downward. The magnitude of the effect depends on the specific prior distributions used and the length of the sequences analyzed. Under realistic conditions, however, even extraordinarily long sequences are not enough to prevent frequent inference of incorrect clades with strong support. We found that across a range of conditions, diffuse priors (either flat or exponential distributions with moderate to large means) provide more reliable inferences than small-mean exponential priors. An empirical Bayes approach that fixes branch lengths at their maximum likelihood estimates yields posterior probabilities that more closely match those that would be inferred if the true branch lengths were known in advance and reduces the rate of strongly supported false inferences compared with fully Bayesian integration.
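The two quantities contrasted in the abstract can be written out as a rough sketch. The notation here is an assumption introduced for illustration and is not taken from the article: \tau denotes a tree topology, D the sequence data, b the vector of branch lengths on \tau, p(b \mid \tau) the branch length prior, and \hat{b}_\tau the maximum likelihood branch lengths for \tau. Other model parameters are suppressed for brevity.

```latex
% Fully Bayesian posterior: branch lengths are integrated out under the prior p(b | tau)
P(\tau \mid D) \;=\;
  \frac{P(\tau) \int P(D \mid \tau, b)\, p(b \mid \tau)\, db}
       {\sum_{\tau'} P(\tau') \int P(D \mid \tau', b')\, p(b' \mid \tau')\, db'}

% Empirical Bayes approximation: branch lengths are fixed at their ML estimates
\hat{b}_\tau \;=\; \arg\max_{b}\, P(D \mid \tau, b),
\qquad
P_{\mathrm{EB}}(\tau \mid D) \;=\;
  \frac{P(\tau)\, P(D \mid \tau, \hat{b}_\tau)}
       {\sum_{\tau'} P(\tau')\, P(D \mid \tau', \hat{b}_{\tau'})}
```

In the first expression the choice of p(b \mid \tau) enters every term, which is one way to see why different branch length priors can shift the posterior probabilities; in the second, the prior on branch lengths drops out because no integration is performed.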