Universal Data Compression Algorithm Based on Approximate String Matching
- 1 October 1996
- journal article
- research article
- Published by Cambridge University Press (CUP) in Probability in the Engineering and Informational Sciences
- Vol. 10 (4) , 465-486
- https://doi.org/10.1017/s0269964800004502
Abstract
A practical source coding scheme based on approximate string matching is proposed. It is an approximate fixed-length string matching data compression combined with a block-coder based on the empirical distribution. A lemma on approximate string matching, which is an extension of the Kac Lemma, is proved. It is shown, based on the lemma, that the deterministic algorithm converts the stationary and ergodic source, u, into an output process v, and under the assumption that v is a stationary process, after the scheme has run for an infinite time, the optimal compression ratio R(D) is achieved. This reduces the problem of the universal lossy coder to the proof of stationarity of the output process ν in the proposed algorithm. The main advantages of the proposed method are the asymptotic sequential behavior of the encoder and the simplicity of the decoder.Keywords
This publication has 27 references indexed in Scilit:
- Operational rate distortion theoryPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- An algorithm for source coding subject to a fidelity criterion, based on string matchingIEEE Transactions on Information Theory, 1993
- Entropy and data compression schemesIEEE Transactions on Information Theory, 1993
- Upper bounds on the probability of sequences emitted by finite-state sources and on the redundancy of the Lempel-Ziv algorithmIEEE Transactions on Information Theory, 1992
- The Erdos-Renyi Law in Distribution, for Coin Tossing and Sequence MatchingThe Annals of Statistics, 1990
- Entropy and Information TheoryPublished by Springer Nature ,1990
- Lossy On-Line Dynamic Data CompressionPublished by Springer Nature ,1990
- Data structures and algorithms for approximate string matchingJournal of Complexity, 1988
- The Individual Ergodic Theorem of Information TheoryThe Annals of Mathematical Statistics, 1957
- On the notion of recurrence in discrete stochastic processesBulletin of the American Mathematical Society, 1947