RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers
Open Access
- 21 February 2006
- journal article
- Published by Cold Spring Harbor Laboratory in RNA
- Vol. 12 (3) , 342-352
- https://doi.org/10.1261/rna.2164906
Abstract
We present a machine learning method (a hierarchical network of k-nearest neighbor classifiers) that uses an RNA sequence alignment to predict a consensus RNA secondary structure. The input to the network is the mutual information, the fraction of complementary nucleotides, and a novel consensus RNAfold secondary structure prediction of a pair of alignment columns and its nearest neighbors. Given this input, the network computes a prediction as to whether a particular pair of alignment columns corresponds to a base pair. By using a comprehensive test set of 49 RFAM alignments, the program KNetFold achieves an average Matthews correlation coefficient of 0.81. This is a significant improvement compared with the secondary structure prediction methods PFOLD and RNAalifold. By using the example of archaeal RNase P, we show that the program can also predict pseudoknot interactions.
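To make two of the inputs named in the abstract concrete, the following is a minimal Python sketch of the mutual information and the fraction of complementary nucleotides for a pair of alignment columns, plus the Matthews correlation coefficient used for evaluation. It is not the KNetFold implementation: the abstract does not give the exact definitions, so the function names, the skipping of gapped rows, and the choice to count G-U wobble pairs as complementary are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumption: Watson-Crick pairs plus G-U wobble pairs count as complementary.
COMPLEMENTARY = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def _ungapped_pairs(col_i, col_j):
    # Assumption: rows with a gap ('-') in either column are ignored.
    return [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    pairs = _ungapped_pairs(col_i, col_j)
    n = len(pairs)
    if n == 0:
        return 0.0
    freq_i = Counter(a for a, _ in pairs)
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def complementary_fraction(col_i, col_j):
    """Fraction of ungapped rows whose two nucleotides can base-pair."""
    pairs = _ungapped_pairs(col_i, col_j)
    if not pairs:
        return 0.0
    return sum(p in COMPLEMENTARY for p in pairs) / len(pairs)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy alignment (hypothetical data): columns 0 and 3 covary.
    toy = ["GAAC", "CAAG", "AAAU", "UAAA"]
    print(mutual_information(column(toy, 0), column(toy, 3)))
    print(complementary_fraction(column(toy, 0), column(toy, 3)))
```

In the toy alignment, columns 0 and 3 change together while staying complementary, so both features are high; columns 1 and 2 are fully conserved and non-complementary, so both features are zero. A classifier such as the one described above would receive features like these for each candidate column pair.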