Building the Tree of Life on Terascale Systems
- 1 January 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Bayesian phylogenetic inference is an important alternative to maximum likelihood-based phylogenetic methods. However, inferring large trees using the Bayesian approach is computationally demanding, requiring huge amounts of memory and months of computational time. With a combination of novel parallel algorithms and the latest system technology, terascale phylogenetic tools give biologists the computational power necessary to conduct experiments on very large datasets, and thus aid construction of the tree of life. In this work we evaluate the performance of PBPI, a parallel application that reconstructs phylogenetic trees using MCMC-based Bayesian methods, on two terascale systems: Blue Gene/L at IBM Rochester and System X at Virginia Tech. Our results confirm that, for a benchmark dataset with 218 taxa and 10000 characters, PBPI can achieve linear speedup on 1024 or more processors on both systems.
This publication has 20 references indexed in Scilit:
- PBPI: a High Performance Implementation of Bayesian Phylogenetic Inference. Published by Institute of Electrical and Electronics Engineers (IEEE), 2006
- The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis. Nucleic Acids Research, 2004
- Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics, 2004
- Parallel algorithms for Bayesian phylogenetic inference. Journal of Parallel and Distributed Computing, 2003
- TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics, 2002
- High-Performance Algorithm Engineering for Computational Phylogenetics. The Journal of Supercomputing, 2002
- Parallel implementation and performance of fastDNAml. Published by Association for Computing Machinery (ACM), 2001
- Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites. Molecular Biology and Evolution, 1996
- Markov Chains for Exploring Posterior Distributions. The Annals of Statistics, 1994
- fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Bioinformatics, 1994