IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming
Open Access
- 14 June 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 27 (13), i85-i93
- https://doi.org/10.1093/bioinformatics/btr215
Abstract
Motivation: Pseudoknots found in the secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfactory predictions in terms of both speed and accuracy.
Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing the expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, which allows it to model a wide class of pseudoknots and to run quickly. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running times than several competitive prediction methods.
Availability: The program of IPknot is available at http://www.ncrna.org/software/ipknot/. IPknot is also available as a web server at http://rna.naist.jp/ipknot/.
Contact: satoken@k.u-tokyo.ac.jp; ykato@is.naist.jp
Supplementary information: Supplementary data are available at Bioinformatics online.
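The decomposition and threshold cut described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The function names (`crosses`, `predict`), the probability values, and the threshold `theta` are all assumptions made for illustration, and brute-force enumeration stands in for an integer programming solver. The sketch keeps only base pairs whose probability exceeds the threshold, then assigns each kept pair to at most one pseudoknot-free level so that the summed margin over the threshold is maximal. The abstract does not spell out the paper's full objective and constraints, so this is only a simplified stand-in for them.

```python
from itertools import product

def crosses(a, b):
    """True if base pairs a=(i, j) and b=(k, l), each with i < j, cross."""
    (i, j), (k, l) = sorted([a, b])
    return i < k < j < l

def predict(probs, theta=0.3, levels=2):
    """Toy expected-accuracy maximization by exhaustive search.

    probs maps a base pair (i, j) to a hypothetical pairing probability.
    Returns ({pair: level}, score). Only suitable for a handful of pairs.
    """
    # Threshold cut: discard candidate pairs with probability <= theta.
    cands = [pair for pair, p in probs.items() if p > theta]
    best_score, best = 0.0, {}
    # Label 0 means "unused"; labels 1..levels name a pseudoknot-free level.
    for labels in product(range(levels + 1), repeat=len(cands)):
        chosen = [(c, lv) for c, lv in zip(cands, labels) if lv > 0]
        bases = [b for c, _ in chosen for b in c]
        if len(bases) != len(set(bases)):
            continue  # a base would pair more than once
        if any(la == lb and crosses(a, b)
               for n, (a, la) in enumerate(chosen)
               for b, lb in chosen[n + 1:]):
            continue  # pairs inside one level must not cross
        score = sum(probs[c] - theta for c, _ in chosen)
        if score > best_score:
            best_score, best = score, dict(chosen)
    return best, best_score

# Hypothetical probabilities: two nested helices that cross each other.
probs = {(0, 5): 0.9, (1, 4): 0.8, (2, 8): 0.7, (3, 7): 0.6, (1, 6): 0.2}
structure, score = predict(probs)
print(structure, round(score, 3))
```

With two levels, each helix lands in its own pseudoknot-free level, so the crossing pairs can coexist in the result. With `levels=1` the search has to drop one helix, which is the behaviour of a predictor that cannot represent pseudoknots.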