RAxML-Light: a tool for computing terabyte phylogenies
Open Access
- 24 May 2012
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 28 (15), 2064-2066
- https://doi.org/10.1093/bioinformatics/bts309
Abstract
Motivation: Due to advances in molecular sequencing and the increasingly rapid collection of molecular data, the field of phyloinformatics is transforming into a computational science. Therefore, new tools are required that can be deployed in supercomputing environments and that scale to hundreds or thousands of cores.
Results: We describe RAxML-Light, a tool for large-scale phylogenetic inference on supercomputers under maximum likelihood. It implements a light-weight checkpointing mechanism, deploys 128-bit (SSE3) and 256-bit (AVX) vector intrinsics, offers two orthogonal memory saving techniques and provides a fine-grain production-level message passing interface parallelization of the likelihood function. To demonstrate scalability and robustness of the code, we inferred a phylogeny on a simulated DNA alignment (1481 taxa, 20 000 000 bp) using 672 cores. This dataset requires one terabyte of RAM to compute the likelihood score on a single tree.
Code Availability: https://github.com/stamatak/RAxML-Light-1.0.5
Data Availability: http://www.exelixis-lab.org/onLineMaterial.tar.bz2
Contact: alexandros.stamatakis@h-its.org
Supplementary Information: Supplementary data are available at Bioinformatics online.
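The one-terabyte figure quoted in the abstract can be roughly reproduced with a back-of-the-envelope estimate of the memory taken by the per-node likelihood arrays. The sketch below is not taken from RAxML-Light. It assumes an unrooted binary tree (taxa minus two inner nodes), four DNA states, double-precision values, and a single rate category per site. The abstract states none of these, and the function name `clv_memory_bytes` is hypothetical.

```python
def clv_memory_bytes(taxa, sites, states=4, rate_categories=1, bytes_per_value=8):
    """Rough memory estimate for the likelihood arrays of one tree.

    Assumptions (not stated in the abstract): an unrooted binary tree
    with taxa - 2 inner nodes, one array per inner node, and one value
    per site, state and rate category, stored in double precision.
    """
    inner_nodes = taxa - 2
    return inner_nodes * sites * states * rate_categories * bytes_per_value


# Dataset from the abstract: 1481 taxa, 20 000 000 bp, run on 672 cores.
total = clv_memory_bytes(taxa=1481, sites=20_000_000)
per_core = total / 672

print(f"total: {total / 1e12:.2f} TB")
print(f"per core: {per_core / 1e9:.2f} GB")
```

Under these assumptions the estimate lands just below one terabyte, in line with the abstract. With four discrete rate categories per site (`rate_categories=4`) it would be four times larger, so the quoted figure fits the single-category assumption better.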