Accelerated probabilistic inference of RNA structure evolution
Open Access
- 24 March 2005
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (1), 73
- https://doi.org/10.1186/1471-2105-6-73
Abstract
Background: Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered.
Results: We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database.
Conclusion: A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License.
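The idea of using a first pass to limit the range of alignments that are considered can be illustrated on a much simpler problem than Pair SCFG alignment. The following is a minimal sketch, not Stemloc's implementation: it scores a plain global sequence alignment (no structure), and fills in only those dynamic programming cells that lie inside an "envelope" of allowed cells. The function names `banded_envelope` and `constrained_align_score`, the diagonal-band envelope, and the match/mismatch/gap scores are all assumptions made for illustration. In practice the envelope would come from a pre-processing pass over the sequences rather than a fixed band.

```python
from math import inf

def banded_envelope(n, m, width):
    """Hypothetical envelope: allow cell (i, j) only near the scaled diagonal."""
    allowed = set()
    for i in range(n + 1):
        center = round(i * m / n) if n else 0
        lo = max(0, center - width)
        hi = min(m, center + width)
        for j in range(lo, hi + 1):
            allowed.add((i, j))
    return allowed

def constrained_align_score(x, y, allowed, match=1, mismatch=-1, gap=-2):
    """Global alignment score, visiting only cells contained in `allowed`.

    Returns (score, number_of_cells_filled). The score is -inf if the
    envelope leaves no path from (0, 0) to (len(x), len(y)).
    """
    n, m = len(x), len(y)
    score = {(0, 0): 0}  # the start cell is always filled
    for i in range(n + 1):
        for j in range(m + 1):
            if (i, j) == (0, 0) or (i, j) not in allowed:
                continue  # cells outside the envelope are never computed
            best = -inf
            if i > 0 and j > 0 and (i - 1, j - 1) in score:
                sub = match if x[i - 1] == y[j - 1] else mismatch
                best = max(best, score[(i - 1, j - 1)] + sub)
            if i > 0 and (i - 1, j) in score:
                best = max(best, score[(i - 1, j)] + gap)
            if j > 0 and (i, j - 1) in score:
                best = max(best, score[(i, j - 1)] + gap)
            if best > -inf:
                score[(i, j)] = best
    return score.get((n, m), -inf), len(score)

if __name__ == "__main__":
    x, y = "GGGAAACCC", "GGGAACCC"
    full = banded_envelope(len(x), len(y), max(len(x), len(y)))
    band = banded_envelope(len(x), len(y), 1)
    print("unconstrained:", constrained_align_score(x, y, full))
    print("banded:       ", constrained_align_score(x, y, band))
```

When the envelope contains an optimal path, the constrained score equals the unconstrained one while far fewer cells are filled. When it does not, the result is only the best alignment compatible with the constraint, which is the trade-off between cost and accuracy that the abstract describes.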