Efficient Algorithms for Probing the RNA Mutation Landscape
Open Access
- 8 August 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 4 (8) , e1000124
- https://doi.org/10.1371/journal.pcbi.1000124
Abstract
The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. 
In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3′ UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/.
Author Summary
Evolution is a central concept in biology. This phenomenon can be observed at all levels of the organization of life, from single molecules to multicellular organisms. Here, we focus our attention on the implication of evolution at the level of nucleic acid sequences. In this context, RNA sequences presumably have been optimized by evolution to achieve specific functions. These functions are supported by a structure that can be determined using thermodynamics-based models and energy minimization techniques. In this work, we develop efficient algorithms for predicting energetically favorable mutations and study their impact on the stability of the structure. We use these techniques to reveal sequences under evolutionary pressure and design new methods to predict lethal mutations. Applications of our tool lead to a better understanding of the mutational process of some key regulatory elements of two important pathogenic RNA viruses: human immunodeficiency virus and hepatitis C virus.
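To make the quantity Z(k) from the abstract concrete, the following is a minimal Python sketch of a partition function summed over all k-point mutants and all their secondary structures. It is not the RNAmutants implementation: it assumes a toy Nussinov-style model in which each base pair contributes an independent energy, whereas the paper builds on the McCaskill algorithm. The pair energies, the `RT` value, the minimum loop size and all function names are hypothetical choices made for illustration only.

```python
import itertools
import math
from functools import lru_cache

NUCS = "ACGU"
# Hypothetical toy energies (kcal/mol) per base pair; NOT the Turner parameters.
PAIR_ENERGY = {("G", "C"): -3.0, ("C", "G"): -3.0,
               ("A", "U"): -2.0, ("U", "A"): -2.0,
               ("G", "U"): -1.0, ("U", "G"): -1.0}
RT = 0.616     # assumed value of RT in kcal/mol (roughly 37 degrees C)
MIN_LOOP = 3   # assumed minimum number of unpaired bases enclosed by a pair


def pair_weight(x, y):
    """Boltzmann weight of pairing x with y; 0.0 if they cannot pair."""
    energy = PAIR_ENERGY.get((x, y))
    return 0.0 if energy is None else math.exp(-energy / RT)


def mutant_partition_functions(seq, kmax):
    """Return [Z(0), ..., Z(kmax)], where Z(k) sums the Boltzmann weights of
    all structures of all sequences at Hamming distance exactly k from seq."""

    @lru_cache(maxsize=None)
    def Z(i, j, k):
        # Sum over all k-point mutants of seq[i..j] and all their structures.
        if k < 0:
            return 0.0
        if j < i:  # empty interval: one empty structure, no mutation possible
            return 1.0 if k == 0 else 0.0
        # Case 1: position j is unpaired; keep its nucleotide or pick one
        # of the 3 alternatives (which uses up one mutation).
        total = Z(i, j - 1, k) + 3 * Z(i, j - 1, k - 1)
        # Case 2: position j pairs with r; choose both nucleotides freely
        # among the pairable combinations and split the remaining mutations.
        for r in range(i, j - MIN_LOOP):
            for x in NUCS:
                for y in NUCS:
                    w = pair_weight(x, y)
                    if w == 0.0:
                        continue
                    rest = k - (x != seq[r]) - (y != seq[j])
                    for k1 in range(rest + 1):
                        total += w * Z(i, r - 1, k1) * Z(r + 1, j - 1, rest - k1)
        return total

    return [Z(0, len(seq) - 1, k) for k in range(kmax + 1)]


def brute_force_Z(seq, k):
    """Reference value: enumerate every k-point mutant explicitly."""
    total = 0.0
    for cand in itertools.product(NUCS, repeat=len(seq)):
        if sum(a != b for a, b in zip(cand, seq)) == k:
            total += mutant_partition_functions("".join(cand), 0)[0]
    return total


if __name__ == "__main__":
    for k, z in enumerate(mutant_partition_functions("GGGAAACCC", 3)):
        print(f"Z({k}) = {z:.4g}")
```

The recursion is plain Python and memoized with `lru_cache`, so it is only suitable for short sequences; `brute_force_Z` is exponential in the sequence length and is included solely as a sanity check on the recurrence. In this toy model, replacing sums by minima and products by sums in the same recurrence would give a minimum-energy analogue of MFE(k).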