Extended application of suffix trees to data compression
- 23 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10680314,p. 190-199
- https://doi.org/10.1109/dcc.1996.488324
Abstract
A practical scheme for maintaining an index for a sliding window in optimal time and space, by use of a suffix tree, is presented. The index supports location of the longest matching substring in time proportional to the length of the match. The total time for build and update operations is proportional to the size of the input. The algorithm, which is simple and straightforward, is presented in detail. The most prominent lossless data compression scheme, when considering compression performance, is prediction by partial matching with unbounded context lengths (PPM). However, previously presented algorithms are hardly practical, considering their extensive use of computational resources. We show that our scheme can be applied to PPM-style compression, obtaining an algorithm that runs in linear time, and in space bounded by an arbitrarily chosen window size. Application to Ziv-Lempel (1977) compression methods is straightforward and the resulting algorithm runs in linear time.Keywords
This publication has 13 references indexed in Scilit:
- Arithmetic coding revisitedPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Unbounded length contexts for PPMPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- On-line construction of suffix treesAlgorithmica, 1995
- The zero-frequency problem: estimating the probabilities of novel events in adaptive text compressionIEEE Transactions on Information Theory, 1991
- Implementing the PPM data compression schemeIEEE Transactions on Communications, 1990
- Data compression with finite windowsCommunications of the ACM, 1989
- Arithmetic coding for data compressionCommunications of the ACM, 1987
- Data Compression Using Adaptive Coding and Partial String MatchingIEEE Transactions on Communications, 1984
- A universal algorithm for sequential data compressionIEEE Transactions on Information Theory, 1977
- A Space-Economical Suffix Tree Construction AlgorithmJournal of the ACM, 1976