Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures
Open Access
- 15 March 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (11) , 2611-2617
- https://doi.org/10.1093/bioinformatics/bti385
Abstract
Motivation: Since the whole genome sequences of many species have been determined, computational prediction of RNA secondary structures and computational identification of those non-coding RNA regions by comparative genomics become important. Therefore, more advanced alignment methods are required. Recently, an approach of structural alignment for RNA sequences has been introduced to solve these problems. Pair hidden Markov models on tree structures (PHMMTSs) proposed by Sakakibara are efficient automata-theoretic models for structural alignment of RNA secondary structures, although PHMMTSs are incapable of handling pseudoknots. On the other hand, tree adjoining grammars (TAGs), a subclass of context-sensitive grammars, are suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs to be able to handle pseudoknots. Results: We propose pair stochastic TAGs (PSTAGs) for aligning and predicting RNA secondary structures including a simple type of pseudoknot which can represent most known pseudoknot structures. First, we extend PHMMTSs defined on alignment of ‘trees’ to PSTAGs defined on alignment of ‘TAG trees’ which represent derivation processes of TAGs and are functionally equivalent to derived trees of TAGs. Then, we develop an efficient dynamic programming algorithm of PSTAGs for obtaining an optimal structural alignment including pseudoknots. We implement the PSTAG algorithm and demonstrate the properties of the algorithm by using it to align and predict several small pseudoknot structures. We believe that our implemented program based on PSTAGs is the first grammar-based and practically executable software for comparative analyses of RNA pseudoknot structures, and, further, non-coding RNAs. Availability: The source code of PSTAG and its web application are available at http://phmmts.dna.bio.keio.ac.jp/pstag/ Contact:yasu@bio.keio.ac.jpKeywords
This publication has 16 references indexed in Scilit:
- An Iterated loop matching approach to the prediction of RNA secondary structures with pseudoknotsBioinformatics, 2004
- A dynamic programming algorithm for RNA structure prediction including pseudoknots 1 1Edited by I. TinocoJournal of Molecular Biology, 1999
- Tree adjoining grammars for RNA structure predictionTheoretical Computer Science, 1999
- The Computer Simulation of RNA Folding Pathways Using a Genetic AlgorithmJournal of Molecular Biology, 1995
- AN APL-PROGRAMMED GENETIC ALGORITHM FOR THE PREDICTION OF RNA SECONDARY STRUCTUREJournal of Theoretical Biology, 1995
- CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choiceNucleic Acids Research, 1994
- Structural and functional aspects of RNA pseudoknotsBiochemistry, 1992
- Prediction of RNA secondary structure, including pseudoknotting, by computer simulationNucleic Acids Research, 1990
- Five pseudoknots are present at the 204 nucleotides long 3' noncoding region of tobacco mosak virus RNANucleic Acids Research, 1985
- Tree adjunct grammarsJournal of Computer and System Sciences, 1975