Faster approximate string matching over compressed text
- 13 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 459-468
- https://doi.org/10.1109/dcc.2001.917177
Abstract
Approximate string matching on compressed text was an open problem for almost a decade. The two existing solutions are very new. Despite that they represent important complexity breakthroughs, in most practical cases they are not useful, in the sense that they are slower than uncompressing the text and then searching the uncompressed text. We present a different approach, which reduces the problem to multipattern searching of pattern pieces plus local decompression and direct verification of candidate text areas. We show experimentally that this solution is 10-30 times faster than previous work and up to three times faster than the trivial approach of uncompressing and searching, thus becoming the first practical solution to the problem.Keywords
This publication has 17 references indexed in Scilit:
- Multiple pattern matching in LZW compressed textPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Bit-parallel approach to approximate string matching in compressed textsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Fast and flexible word searching on compressed textACM Transactions on Information Systems, 2000
- Approximate String Matching over Ziv—Lempel Compressed TextPublished by Springer Nature ,2000
- Very fast and simple approximate string matchingInformation Processing Letters, 1999
- A text compression scheme that allows fast searching directly in the compressed fileACM Transactions on Information Systems, 1997
- Combinatorial Pattern MatchingPublished by Springer Nature ,1997
- A fast string searching algorithmCommunications of the ACM, 1977
- Fast Pattern Matching in StringsSIAM Journal on Computing, 1977
- Efficient string matchingCommunications of the ACM, 1975