Incremental String Comparison
- 1 April 1998
- journal article
- Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Computing
- Vol. 27 (2) , 557-582
- https://doi.org/10.1137/s0097539794264810
Abstract
The problem of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them has been much studied. In this paper we consider the following incremental version of these problems: given an appropriate encoding of a comparison between A and B, can one incrementally compute the answer for A and bB, and the answer for A and Bb with equal efficiency, where b is an additional symbol? Our main result is a theorem exposing a surprising relationship between the dynamic programming solutions for two such "adjacent" problems. Given a threshold k on the number of differences to be permitted in an alignment, the theorem leads directly to an O(k) algorithm for incrementally computing a new solution from an old one, as contrasts the O(k2) time required to compute a solution from scratch. We further show, with a series of applications, that this algorithm is indeed more powerful than its nonincremental counterpart. We show this by solving the applications with greater asymptotic efficiency than heretofore possible. For example, we obtain O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.Keywords
This publication has 23 references indexed in Scilit:
- Recursive Star-Tree Parallel Data StructureSIAM Journal on Computing, 1993
- An efficient algorithm for the All Pairs Suffix-Prefix ProblemInformation Processing Letters, 1992
- An Improved Algorithm For Approximate String MatchingSIAM Journal on Computing, 1990
- On a cyclic string-to-string correction problemInformation Processing Letters, 1990
- Fast parallel and serial approximate string matchingJournal of Algorithms, 1989
- Fast string matching with k differencesJournal of Computer and System Sciences, 1988
- Fast Algorithms for Finding Nearest Common AncestorsSIAM Journal on Computing, 1984
- Approximate String MatchingACM Computing Surveys, 1980
- Algorithms for the Longest Common Subsequence ProblemJournal of the ACM, 1977
- A fast algorithm for computing longest common subsequencesCommunications of the ACM, 1977