Evaluation of the suitability of free-energy minimization using nearest-neighbor energy parameters for RNA secondary structure prediction
Open Access
- 5 August 2004
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 5 (1) , 105
- https://doi.org/10.1186/1471-2105-5-105
Abstract
Background: A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1.
Results: The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%. The prediction accuracies for the majority of the 16S and 23S rRNA structures tested are between 20% and 60%, and some are below 20%. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases.
Conclusion: Our analysis indicates that the current set of nearest-neighbor energy parameters, in conjunction with the Mfold folding algorithm, is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.
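The two quantities reported in the abstract, prediction accuracy and its dependence on contact distance, can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: accuracy is taken to be the fraction of base-pairs in the comparative (reference) structure that also appear in the predicted structure, contact distance is taken to be the number of nucleotides lying between the two paired positions, and structures are written in dot-bracket notation without pseudoknots. All function names are hypothetical and are not part of Mfold.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base-pairs (0-based, i < j) in a dot-bracket string."""
    stack, pairs = [], set()
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def prediction_accuracy(reference_pairs, predicted_pairs):
    """Fraction of reference base-pairs reproduced exactly (assumed definition)."""
    if not reference_pairs:
        raise ValueError("reference structure has no base-pairs")
    return len(reference_pairs & predicted_pairs) / len(reference_pairs)


def accuracy_by_contact_distance(reference_pairs, predicted_pairs, bin_width=100):
    """Accuracy per contact-distance bin, keyed by the lower edge of each bin.

    Contact distance is assumed to be the count of nucleotides strictly
    between the two paired positions, i.e. j - i - 1.
    """
    totals, hits = {}, {}
    for i, j in reference_pairs:
        bin_start = ((j - i - 1) // bin_width) * bin_width
        totals[bin_start] = totals.get(bin_start, 0) + 1
        if (i, j) in predicted_pairs:
            hits[bin_start] = hits.get(bin_start, 0) + 1
    return {b: hits.get(b, 0) / totals[b] for b in sorted(totals)}


# Toy structures for illustration only; these are not data from the study.
reference = pairs_from_dot_bracket("((..((...))..))")
predicted = pairs_from_dot_bracket("((..(.....)..))")

overall = prediction_accuracy(reference, predicted)
binned = accuracy_by_contact_distance(reference, predicted, bin_width=5)
print(overall)
print(binned)
```

With real data, the reference pairs would come from a comparative structure model and the predicted pairs from a folding program's output. The bin width of 100 mirrors the 100-nucleotide threshold mentioned in the conclusion and is otherwise arbitrary.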