The Importance of Data Partitioning and the Utility of Bayes Factors in Bayesian Phylogenetics
Open Access
- 1 August 2007
- journal article
- Published by Oxford University Press (OUP) in Systematic Biology
- Vol. 56 (4), 643-655
- https://doi.org/10.1080/10635150701546249
Abstract
As larger, more complex data sets are being used to infer phylogenies, accuracy of these phylogenies increasingly requires models of evolution that accommodate heterogeneity in the processes of molecular evolution. We investigated the effect of improper data partitioning on phylogenetic accuracy, as well as the type I error rate and sensitivity of Bayes factors, a commonly used method for choosing among different partitioning strategies in Bayesian analyses. We also used Bayes factors to test empirical data for the need to divide data in a manner that has no expected biological meaning. Posterior probability estimates are misleading when an incorrect partitioning strategy is assumed. The error was greatest when the assumed model was underpartitioned. These results suggest that model partitioning is important for large data sets. Bayes factors performed well, giving a 5% type I error rate, which is remarkably consistent with standard frequentist hypothesis tests. The sensitivity of Bayes factors was found to be quite high when the across-class model heterogeneity reflected that of empirical data. These results suggest that Bayes factors represent a robust method of choosing among partitioning strategies. Lastly, results of tests for the inclusion of unexpected divisions in empirical data mirrored the simulation results, although the outcome of such tests is highly dependent on accounting for rate variation among classes. We conclude by discussing other approaches for partitioning data, as well as other applications of Bayes factors.
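For readers unfamiliar with how a Bayes factor is used to choose between two partitioning strategies, the following is a minimal illustrative sketch, not the authors' procedure. It assumes the marginal log likelihood of each strategy has already been estimated by some other means. The function names and the numeric values are hypothetical, and the verbal categories follow the commonly cited Kass and Raftery (1995) scale for 2 ln(BF) rather than any threshold stated in the abstract.

```python
def two_ln_bayes_factor(log_ml_1: float, log_ml_0: float) -> float:
    """Return 2 ln(BF_10), where BF_10 is the ratio of marginal likelihoods.

    log_ml_1: marginal log likelihood of the more partitioned strategy
    log_ml_0: marginal log likelihood of the less partitioned strategy
    """
    return 2.0 * (log_ml_1 - log_ml_0)


def kass_raftery_category(two_ln_bf: float) -> str:
    """Map 2 ln(BF_10) to the Kass and Raftery (1995) evidence categories."""
    if two_ln_bf < 0:
        return "evidence favors the less partitioned strategy"
    if two_ln_bf <= 2:
        return "not worth more than a bare mention"
    if two_ln_bf <= 6:
        return "positive"
    if two_ln_bf <= 10:
        return "strong"
    return "very strong"


if __name__ == "__main__":
    # Hypothetical marginal log likelihoods, invented for illustration only.
    log_ml_partitioned = -12345.6
    log_ml_unpartitioned = -12360.1

    stat = two_ln_bayes_factor(log_ml_partitioned, log_ml_unpartitioned)
    print(f"2 ln(BF) = {stat:.1f}: {kass_raftery_category(stat)}")
```

Working on the log scale avoids numerical underflow, since marginal likelihoods for phylogenetic data sets are far too small to represent directly as floating-point numbers.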