Finding the most significant common sequence and structure motifs in a set of RNA sequences
Open Access
- 1 September 1997
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 25 (18), 3724-3732
- https://doi.org/10.1093/nar/25.18.3724
Abstract
We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pairwise comparisons. The algorithm finds the multiple alignments using a greedy approach and has similarities to both CLUSTAL and CONSENSUS, but the core algorithm assures that the pairwise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other approaches, are provided. The solutions include finding consensus structures identical to published ones.
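The greedy, pairwise-driven construction mentioned in the abstract can be illustrated with a small sketch. This is not the authors' procedure: it assumes the pairwise scores (which in the paper would come from the combined sequence and structure alignment) are already available in a table, and it only shows one plausible way to seed with the best pair and then add sequences one at a time. The function name `greedy_collection` and the `pair_score` table layout are hypothetical.

```python
from itertools import combinations


def greedy_collection(names, pair_score):
    """Order sequences greedily from a table of pairwise scores.

    names      -- list of sequence identifiers
    pair_score -- dict mapping frozenset({a, b}) to a numeric score
                  (hypothetical layout; scores are assumed precomputed
                  by some pairwise sequence-and-structure aligner)
    """
    if len(names) < 2:
        raise ValueError("need at least two sequences")

    def score(a, b):
        # The table is keyed by unordered pairs.
        return pair_score[frozenset((a, b))]

    # Seed the collection with the best-scoring pair.
    first, second = max(combinations(names, 2), key=lambda p: score(*p))
    chosen = [first, second]
    remaining = [n for n in names if n not in chosen]

    # Repeatedly add the sequence with the highest mean score
    # against the members already in the collection.
    while remaining:
        best = max(
            remaining,
            key=lambda n: sum(score(n, m) for m in chosen) / len(chosen),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Calling `greedy_collection(names, pair_score)` returns the sequence identifiers in the order they would be incorporated. A fuller treatment would also re-score the growing alignment at each step and decide where to stop, which this sketch leaves out.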