A comprehensive comparison of multiple sequence alignment programs
Open Access
- 1 July 1999
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 27 (13) , 2682-2690
- https://doi.org/10.1093/nar/27.13.2682
Abstract
In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the ‘twilight zone’ at 10–20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.
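The abstract reports the share of residues a program aligns correctly against a benchmark, but it does not say how that share is computed. As an illustration only, the sketch below assumes a pairwise measure: the fraction of residue pairs aligned in a reference alignment that a test alignment also places in the same column. The function names `aligned_pairs` and `pair_recovery` and the toy sequences are hypothetical and are not taken from the paper or from BAliBASE.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings. Each pair is
    (seq_i, residue_i, seq_j, residue_j), where residue indices count
    non-gap characters from 0 within each sequence.
    """
    # For every row, record the residue index at each column (None for a gap).
    indexed = []
    for row in alignment:
        positions, count = [], 0
        for ch in row:
            if ch == gap:
                positions.append(None)
            else:
                positions.append(count)
                count += 1
        indexed.append(positions)

    pairs = set()
    for col in range(len(alignment[0])):
        for i, j in combinations(range(len(alignment)), 2):
            a, b = indexed[i][col], indexed[j][col]
            if a is not None and b is not None:
                pairs.add((i, a, j, b))
    return pairs


def pair_recovery(test, reference, gap="-"):
    """Fraction of reference residue pairs reproduced by the test alignment.

    Assumption: both alignments contain the same sequences in the same order.
    """
    ref = aligned_pairs(reference, gap)
    if not ref:
        return 0.0
    return len(ref & aligned_pairs(test, gap)) / len(ref)


# Toy example (made-up sequences): the test alignment misplaces one gap.
reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
score = pair_recovery(test, reference)
```

Here the reference aligns four residue pairs and the test alignment reproduces three of them, so `score` is 0.75. A stricter column-based measure, which counts only columns reproduced in full, would be a reasonable alternative under different assumptions.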