Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood
Open Access
- 18 October 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (24) , 4338-4347
- https://doi.org/10.1093/bioinformatics/bti713
Abstract
Motivation: Maximum likelihood (ML) methods have become very popular for constructing phylogenetic trees from sequence data. However, despite noticeable recent progress, with large and difficult datasets (e.g. multiple genes with conflicting signals) current ML programs still require huge computing time and can become trapped in bad local optima of the likelihood function. When this occurs, the resulting trees may still show some of the defects (e.g. long branch attraction) of starting trees obtained using fast distance or parsimony programs. Methods: Subtree pruning and regrafting (SPR) topological rearrangements are usually sufficient to intensively search the tree space. Here, we propose two new methods to make SPR moves more efficient. The first method uses a fast distance-based approach to detect the least promising candidate SPR moves, which are then simply discarded. The second method locally estimates the change in likelihood for any remaining potential SPRs, as opposed to globally evaluating the entire tree for each possible move. These two methods are implemented in a new algorithm with a sophisticated filtering strategy, which efficiently selects potential SPRs and concentrates most of the likelihood computation on the promising moves. Results: Experiments with real datasets comprising 35–250 taxa show that, while indeed greatly reducing the amount of computation, our approach provides likelihood values at least as good as those of the best-known ML methods so far and is very robust to poor starting trees. Furthermore, combining our new SPR algorithm with local moves such as PHYML's nearest neighbor interchanges, the time needed to find good solutions can sometimes be reduced even more. Availability: Executables of our SPR program and the used datasets are available for download at Contact:gascuel@lirmm.fr; wim@santafe.eduKeywords
This publication has 28 references indexed in Scilit:
- RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic treesBioinformatics, 2004
- IQPNNI: Moving Fast Through Tree Space and Stopping in TimeMolecular Biology and Evolution, 2004
- Improvement of Distance-Based Phylogenetic Methods by a Local Maximum Likelihood Approach Using TripletsMolecular Biology and Evolution, 2002
- Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution PrincipleJournal of Computational Biology, 2002
- Inferring evolutionary trees with strong combinatorial evidenceTheoretical Computer Science, 2000
- Probability distribution of molecular evolutionary trees: A new method of phylogenetic inferenceJournal of Molecular Evolution, 1996
- Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methodsJournal of Molecular Evolution, 1994
- fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihoodBioinformatics, 1994
- Maximum likelihood inference of protein phylogeny and the origin of chloroplastsJournal of Molecular Evolution, 1990
- Evolutionary trees from DNA sequences: A maximum likelihood approachJournal of Molecular Evolution, 1981