Frequency and isostericity of RNA base pairs
Open Access
- 24 February 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 37 (7), 2294-2312
- https://doi.org/10.1093/nar/gkp011
Abstract
Most of the hairpin, internal and junction loops that appear single-stranded in standard RNA secondary structures form recurrent 3D motifs, where non-Watson–Crick base pairs play a central role. Non-Watson–Crick base pairs also play crucial roles in tertiary contacts in structured RNA molecules. We previously classified RNA base pairs geometrically so as to group together those base pairs that are structurally similar (isosteric) and therefore able to substitute for each other by mutation without disrupting the 3D structure. Here, we introduce a quantitative measure of base pair isostericity, the IsoDiscrepancy Index (IDI), to more accurately determine which base pair substitutions can potentially occur in conserved motifs. We extract and classify base pairs from a reduced-redundancy set of RNA 3D structures from the Protein Data Bank (PDB) and calculate centroids (exemplars) for each base combination and geometric base pair type (family). We use the exemplars and IDI values to update our online Basepair Catalog and the Isostericity Matrices (IM) for each base pair family. From the database of base pairs observed in 3D structures we derive base pair occurrence frequencies for each of the 12 geometric base pair families. In order to improve the statistics from the 3D structures, we also derive base pair occurrence frequencies from rRNA sequence alignments.
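As a rough illustration of what deriving occurrence frequencies per base pair family can look like, the sketch below tallies classified base pairs and normalizes the counts within each family. It is not the authors' pipeline: the function name `family_frequencies`, the input layout (a list of `(family, base_combination)` tuples) and the toy observations are all assumptions made for this example, and the family labels `cWW` and `tHS` are used only as placeholder names for two geometric families.

```python
from collections import Counter, defaultdict


def family_frequencies(observations):
    """Return {family: {base_combination: relative frequency}}.

    `observations` is assumed to be an iterable of
    (family, base_combination) tuples, one per observed base pair.
    """
    # Count how often each base combination occurs within each family.
    counts = defaultdict(Counter)
    for family, bases in observations:
        counts[family][bases] += 1

    # Normalize counts so that frequencies within a family sum to 1.
    freqs = {}
    for family, counter in counts.items():
        total = sum(counter.values())
        freqs[family] = {bases: n / total for bases, n in counter.items()}
    return freqs


# Hypothetical toy observations, NOT data from the paper.
observations = [
    ("cWW", "GC"), ("cWW", "GC"), ("cWW", "AU"), ("cWW", "GU"),
    ("tHS", "AG"), ("tHS", "AG"),
]
freqs = family_frequencies(observations)
```

The same tally could be applied to base pairs inferred from sequence alignments, provided each aligned column pair has already been assigned to a family; that assignment step is outside the scope of this sketch.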