SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments
Open Access
- 11 May 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (14) , 1723-1729
- https://doi.org/10.1093/bioinformatics/btl177
Abstract
Motivation: The functions of non-coding RNAs are strongly related to their secondary structures, but secondary structure prediction from a single sequence is known to be unreliable. To analyze a new non-coding RNA without knowing its exact secondary structure, we therefore have to collect similar RNA sequences that share a common secondary structure. Sequence comparison in a search for similar RNAs should thus consider not only sequence similarity but also potential secondary structure. Sankoff's algorithm predicts the common secondary structures of the sequences, but it is computationally too expensive to apply to large-scale analyses. Because we often want to compare a large number of cDNA sequences or to search for similar RNAs in whole genome sequences, much faster algorithms are required.

Results: We propose a new method of comparing RNA sequences based on structural alignments of fixed-length fragments of the stem candidates. The implemented software, SCARNA (Stem Candidate Aligner for RNAs), is fast enough to apply to long sequences in large-scale analyses. The accuracy of its alignments is better than or comparable to that of much slower existing algorithms.

Availability: A web server for SCARNA with a graphical structural alignment viewer is available online.

Contact: scarna@m.aist.go.jp

Supplementary information: The data and the supplementary information are available online.
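To make the phrase "fixed-length fragments of the stem candidates" concrete, the sketch below enumerates positions where a run of `width` consecutive bases could pair antiparallel with a run further downstream. It is an illustration only, not SCARNA's implementation: the function name `stem_fragments`, the parameters `width` and `min_loop`, and the choice to allow G-U wobble pairs are all assumptions made for this example, and the abstract does not say how SCARNA selects or scores its candidates.

```python
# Hypothetical illustration; not taken from the SCARNA software.
# Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_fragments(seq, width=2, min_loop=3):
    """List (i, j) such that seq[i + k] can pair with seq[j - k] for k < width.

    i is the 5'-most base of the left strand and j the 3'-most base of the
    right strand (0-based). At least `min_loop` unpaired bases must lie
    between the two innermost paired bases.
    """
    seq = seq.upper()
    n = len(seq)
    found = []
    for i in range(n):
        # Smallest j that still leaves min_loop bases inside the fragment.
        for j in range(i + 2 * width + min_loop - 1, n):
            if all((seq[i + k], seq[j - k]) in PAIRS for k in range(width)):
                found.append((i, j))
    return found


if __name__ == "__main__":
    print(stem_fragments("GGGAAAACCC", width=3, min_loop=3))
```

Because every fragment has the same length, two sequences can each be reduced to a list of such fragments and the lists compared with one another, which is the general idea the Results paragraph describes; how SCARNA scores and aligns the fragments is not specified in the abstract.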