A General Edit Distance between RNA Structures
- 1 April 2002
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 9 (2) , 371-388
- https://doi.org/10.1089/10665270252935511
Abstract
Arc-annotated sequences are useful in representing the structural information of RNA sequences. In general, RNA secondary and tertiary structures can be represented as a set of nested arcs and a set of crossing arcs, respectively. Since RNA functions are largely determined by molecular confirmation and therefore secondary and tertiary structures, the comparison between RNA secondary and tertiary structures has received much attention recently. In this paper, we propose the notion of edit distance to measure the similarity between two RNA secondary and tertiary structures, by incorporating various edit operations performed on both bases and arcs (i.e., base-pairs). Several algorithms are presented to compute the edit distance between two RNA sequences with various arc structures and under various score schemes, either exactly or approximately, with provably good performance. Preliminary experimental tests confirm that our definition of edit distance and the computation model are among the most reasonable ones ever studied in the literature.Keywords
This publication has 10 references indexed in Scilit:
- Computing similarity between RNA secondary structuresPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- The Ribonuclease P DatabaseNucleic Acids Research, 1999
- A Constrained Edit Distance Between Unordered Labeled TreesAlgorithmica, 1996
- Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programmingJournal of the ACM, 1995
- Exact and approximate algorithms for unordered tree matchingIEEE Transactions on Systems, Man, and Cybernetics, 1994
- On the editing distance between unordered labeled treesInformation Processing Letters, 1992
- Optimization, approximation, and complexity classesJournal of Computer and System Sciences, 1991
- Fast algorithms for the unit cost editing distance between treesJournal of Algorithms, 1990
- On Finding All Suboptimal Foldings of an RNA MoleculeScience, 1989
- Simultaneous Solution of the RNA Folding, Alignment and Protosequence ProblemsSIAM Journal on Applied Mathematics, 1985