Improving the Practical Space and Time Efficiency of the Shortest-Paths Approach to Sum-of-Pairs Multiple Sequence Alignment

Abstract
The MSA program, written and distributed in 1989, is one of the few existing programs that attempts to find optimal alignments of multiple protein or DNA sequences. The MSA program implements a branch-and-bound technique together with a variant of Dijkstra's shortest paths algorithm to prune the basic dynamic programming graph. We have made substantial improvements in the time and space usage of MSA. The improvements make feasible a variety of problem instances that were not feasible previously. On some runs we achieve an order of magnitude reduction in space usage and a significant multiplicative factor speedup in running time. To explain how these improvements work, we give a much more detailed description of MSA than has been previously available. In practice, MSA rarely produces a provably optimal alignment and we explain why.

This publication has 12 references indexed in Scilit: