Common phrases and minimum-space text storage
- 1 March 1973
- journal article
- Published by Association for Computing Machinery (ACM) in Communications of the ACM
- Vol. 16 (3) , 148-152
- https://doi.org/10.1145/361972.361982
Abstract
A method for saving storage space for text strings, such as compiler diagnostic messages, is described. The method relies on hand selection of a set of text strings which are common to one or more messages. These phrases are then stored only once. The storage technique gives rise to a mathematical optimization problem: determine how each message should use the available phrases to minimize its storage requirement. This problem is nontrivial when phrases which overlap exist. However, a dynamic programming algorithm is presented which solves the problem in time which grows linearly with the number of characters in the text. Algorithm 444 applies to this paper.Keywords
This publication has 3 references indexed in Scilit:
- Algorithm 444: an algorithm for extracting phrases in a space-optimal fashionCommunications of the ACM, 1973
- The quadratic quotient methodCommunications of the ACM, 1970
- PUFFT—The Purdue University fast FORTRAN translatorCommunications of the ACM, 1965