Off-line dictionary-based compression
- 1 November 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Proceedings of the IEEE
- Vol. 88 (11), 1722-1732
- https://doi.org/10.1109/5.892708
Abstract
Dictionary-based modeling is a mechanism used in many practical compression schemes. In most implementations of dictionary-based compression the encoder operates on-line, incrementally inferring its dictionary of available phrases from previous parts of the message. An alternative approach is to use the full message to infer a complete dictionary in advance, and include an explicit representation of the dictionary as part of the compressed message. In this investigation, we develop a compression scheme that is a combination of a simple but powerful phrase derivation method and a compact dictionary encoding. The scheme is highly efficient, particularly in decompression, and has characteristics that make it a favorable choice when compressed data is to be searched directly. We describe data structures and algorithms that allow our mechanism to operate in linear time and space.
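The abstract does not spell out the phrase derivation method, so the following is only an illustrative sketch of the general off-line idea: read the whole message first, infer a dictionary of phrases from it, and store that dictionary explicitly alongside the reduced sequence. It assumes one simple derivation rule, repeatedly replacing the most frequent adjacent pair of symbols with a new symbol. The function names (`count_pairs`, `replace_pair`, `compress`, `expand`, `decompress`) are hypothetical, the dictionary is kept as a plain Python list rather than a compact encoding, and this naive version rescans the sequence on every round, so it does not reflect the linear time and space bounds mentioned above.

```python
from collections import Counter


def count_pairs(seq):
    """Count adjacent pairs, skipping overlaps inside runs such as 'aaa'."""
    counts = Counter()
    skip = False
    for i in range(len(seq) - 1):
        if skip:
            skip = False
            continue
        a, b = seq[i], seq[i + 1]
        counts[(a, b)] += 1
        # In a run like 'aaa' the pair starting at i+1 overlaps this one.
        if a == b and i + 2 < len(seq) and seq[i + 2] == a:
            skip = True
    return counts


def replace_pair(seq, pair, symbol):
    """Replace each left-to-right, non-overlapping occurrence of pair."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def compress(message):
    """Infer a phrase dictionary from the full message (off-line).

    Returns (seq, phrases). Terminals are single characters; the integer k
    stands for the phrase phrases[k] == (left, right).
    """
    seq = list(message)
    phrases = []
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # no pair repeats, so a new phrase would not pay off
            break
        phrases.append(pair)
        seq = replace_pair(seq, pair, len(phrases) - 1)
    return seq, phrases


def expand(symbol, phrases):
    """Recursively expand a symbol back into the text it represents."""
    if isinstance(symbol, int):
        left, right = phrases[symbol]
        return expand(left, phrases) + expand(right, phrases)
    return symbol


def decompress(seq, phrases):
    """Decoding only needs dictionary lookups and concatenation."""
    return "".join(expand(s, phrases) for s in seq)


if __name__ == "__main__":
    text = "singing.do.wah.diddy.diddy.dum.diddy.do"
    seq, phrases = compress(text)
    print(len(text), "symbols reduced to", len(seq), "with", len(phrases), "phrases")
    assert decompress(seq, phrases) == text
```

In this sketch the decoder never searches or counts anything; it only expands dictionary entries, which is in keeping with the abstract's remark that decompression is the particularly efficient direction. Because every phrase is stored explicitly, the reduced sequence can also be inspected symbol by symbol without decoding the whole message first.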