DIALIGN P: Fast pair-wise and multiple sequence alignment using parallel processors
Open Access
- 9 September 2004
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 5 (1) , 128
- https://doi.org/10.1186/1471-2105-5-128
Abstract
Background: Parallel computing is frequently used to speed up computationally expensive tasks in bioinformatics.

Results: We introduce a parallel version of the multiple-alignment program DIALIGN. We propose two ways of dividing the program into independent sub-routines that can be run on different processors. (a) Pair-wise sequence alignments, which are computed as a first step towards the multiple alignment, account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits the sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the running time of DIALIGN by up to 97%.

Conclusions: By distributing sub-routines to multiple processors, the running time of DIALIGN can be reduced substantially. With these improvements, the program can be applied in large-scale genomics and proteomics projects that were previously beyond its scope.
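The idea behind approach (a) can be illustrated with a minimal Python sketch. It is not DIALIGN's code: `toy_pair_score` is a hypothetical stand-in for a pair-wise alignment (it only returns the length of the longest common substring), and `align_all_pairs` with its `executor_factory` parameter is an assumed name chosen for this example. What the sketch shows is that every sequence pair forms an independent task, so the tasks can be handed to a pool of workers and the collected results do not depend on how they were scheduled.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import combinations


def toy_pair_score(pair):
    """Hypothetical stand-in for a pair-wise alignment.

    Returns the length of the longest common substring of the two
    sequences; a real aligner would return an alignment instead.
    """
    a, b = pair
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common substring ending at a[i-2], b[j-2].
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def align_all_pairs(sequences, executor_factory=ProcessPoolExecutor,
                    max_workers=None):
    """Score every unordered sequence pair, one independent task per pair.

    Returns a dict mapping (i, j) index pairs with i < j to their score.
    Pass ThreadPoolExecutor as executor_factory for quick local testing.
    """
    index_pairs = list(combinations(range(len(sequences)), 2))
    seq_pairs = [(sequences[i], sequences[j]) for i, j in index_pairs]
    with executor_factory(max_workers=max_workers) as pool:
        # map() preserves input order, so scores line up with index_pairs.
        scores = list(pool.map(toy_pair_score, seq_pairs))
    return dict(zip(index_pairs, scores))
```

For N input sequences this creates N(N-1)/2 tasks. Because no task reads another task's result, the mapping returned by a parallel run is identical to the one a serial loop over the same pairs would produce, which mirrors the statement above that distribution has no effect on the output.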