RAxML-Cell: Parallel Phylogenetic Tree Inference on the Cell Broadband Engine

Abstract
Computational phylogeny is a challenging application even for the most powerful supercomputers. It is also an ideal candidate for benchmarking emerging multiprocessor architectures, because it exhibits fine- and coarse-grain parallelism at multiple levels. In this paper, we present the porting, optimization, and evaluation of RAxML on the cell broadband engine. RAxML is a provably efficient, hill climbing algorithm for computing phylogenetic trees, based on the maximum likelihood (ML) method. The cell broadband engine, a heterogeneous multi-core processor with SIMD accelerators which was initially marketed for set-top boxes, is currently being deployed on supercomputers and high-end server architectures. We present both conventional and unconventional, cell-specific optimizations for RAxML's search algorithm on a real cell multiprocessor. While exploring these optimizations, we present solutions to problems related to floating point code execution, complex control flow, communication, scheduling, and multilevel parallelization on the cell.