Abstract
This paper analyzes the effect of using an efficient code to compress terms within a document data base. The storage efficiency is expressed in terms of the vocabulary length and the values of certain parameters that describe the structure of the code. For vocabularies of up to 100,000 terms, the average code length is approximately twelve bits. No information is lost through term truncation or abbreviation. The tables required for coding and decoding may be ordered for rapid access without making them harder to update.