The performance of several multiple-sequence alignment programs in relation to secondary-structure features for an rRNA sequence.
Open Access
- 1 April 2000
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 17 (4), 530-539
- https://doi.org/10.1093/oxfordjournals.molbev.a026333
Abstract
The performances of five global multiple-sequence alignment programs (CLUSTAL W, Divide and Conquer, Malign, PileUp, and TreeAlign) were evaluated using part of the animal mitochondrial small subunit (12S) rRNA molecule. Conserved sequence motifs derived from an alignment based on secondary structural information were used to score how well each program aligned a data set of five vertebrate and five invertebrate taxa over a range of parameter values. All of the programs could align the motifs with reasonable accuracy for at least one set of parameter conditions, although if the whole sequence was considered, similarity to the structural alignment was only 25%-34%. Use of small gap costs generally gave more accurate results, although Malign and TreeAlign generated longer alignments when gap costs were low. The programs differed in the consistency of the alignments when gap cost was varied; CLUSTAL W, Divide and Conquer, and TreeAlign were the most accurate and robust, while PileUp performed poorly as gap cost values increased, and the accuracy of Malign fluctuated. Default settings for the programs did not give the best results, and attempting to select similar parameter values in different programs did not always result in more similar alignments. Poor alignment of even well-conserved motifs can occur if these are near sites with insertions or deletions. Since there is no a priori way to determine gap costs and because such costs can vary over the gene, alignment of rRNA sequences, particularly the less well conserved regions, should be treated carefully and aided by secondary structure and conserved motifs. Some motifs are single bases and so are often invisible to alignment programs. Our tests involved the most conserved regions of the 12S rRNA gene, and alignment of less well conserved regions will be more problematical. 
None of the alignments we examined produced a fully resolved phylogeny for the data set, indicating that this portion of 12S rRNA is insufficient for resolution of distant evolutionary relationships.
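The motif-based scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' scoring scheme: the function names (`residue_columns`, `motif_score`), the toy sequences, and the rule that a motif column counts as "intact" only when all its residues share one column in the program's alignment are assumptions made for illustration.

```python
def residue_columns(row):
    """Return the alignment column of each residue in a gapped row (gaps skipped)."""
    return [col for col, ch in enumerate(row) if ch != "-"]


def motif_score(reference, test, motif_cols):
    """Fraction of reference motif columns that stay intact in the test alignment.

    reference, test: dict mapping taxon name -> gapped sequence; each taxon must
        have the same ungapped sequence in both alignments.
    motif_cols: reference column indices that make up the conserved motif.
    A column is counted as intact when every residue it holds in the reference
    is placed in one single column of the test alignment (assumed criterion).
    """
    test_cols = {taxon: residue_columns(row) for taxon, row in test.items()}
    motif_cols = list(motif_cols)
    intact = 0
    for col in motif_cols:
        placed = set()
        for taxon, row in reference.items():
            if row[col] == "-":
                continue  # this taxon has no residue in the reference column
            # Index of this residue within the taxon's ungapped sequence.
            residue_index = sum(1 for ch in row[:col] if ch != "-")
            placed.add(test_cols[taxon][residue_index])
        if len(placed) == 1:
            intact += 1
    return intact / len(motif_cols)


# Toy data (hypothetical): a structure-based reference and a program's output.
reference = {"A": "ACG-TA", "B": "ACGGTA", "C": "AC--TA"}
test = {"A": "ACGT-A", "B": "ACGGTA", "C": "ACT--A"}

left_motif = motif_score(reference, test, range(0, 2))   # columns far from the indel
right_motif = motif_score(reference, test, range(4, 6))  # columns next to the indel
print(left_motif, right_motif)
```

In this toy case the motif next to the insertion/deletion is only partly recovered while the distant one is recovered fully, which mirrors the abstract's observation that well-conserved motifs can be misaligned when they lie near indels.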