Abstract
The most frequently used words in natural or printed English are found unexpectedly to contain only an average proportion of the most frequently used letters. This independence of the word and letter frequency distributions is used to minimise the number of bits necessary to code natural English text. It is shown that mean bit rates of less than 4 per character can be achieved for text using the full ASCII set of 96 characters, by combining a variable bit length representation of each character with a character combination dictionary of 100 or more common words. A simple practical scheme is presented which uses 4, 8 or 12 bits to code the characters and dictionary words. Using this scheme with a 205 word dictionary, a mean code rate of 3.87 bits per character is achieved. It is indicated how even this rate might be improved with a larger dictionary or by basing the dictionary on the more numerous word prefixes.
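
The abstract does not give the code layout, so the following Python sketch is only one plausible way to realise a 4, 8 or 12 bit code and should not be read as the author's scheme. It assumes codes are built from 4-bit nibbles: first nibbles 0 to 12 are complete 4-bit codes, 13 and 14 each introduce an 8-bit code, and 15 introduces a 12-bit code. That assumed layout provides 13 + 32 + 256 = 301 codes, enough for 96 characters plus 205 dictionary words. The names build_codebook, encode and decode, the greedy longest-match rule and the toy symbol ranking are all illustrative assumptions.

```python
def build_codebook(ranked_symbols):
    """Map symbols (most frequent first) to tuples of 4-bit nibbles.

    Assumed layout, not taken from the paper:
      first nibble 0..12  -> 13 codes of 4 bits
      first nibble 13, 14 -> 32 codes of 8 bits
      first nibble 15     -> 256 codes of 12 bits
    """
    codes = [(n,) for n in range(13)]
    codes += [(p, n) for p in (13, 14) for n in range(16)]
    codes += [(15, a, b) for a in range(16) for b in range(16)]
    if len(ranked_symbols) > len(codes):
        raise ValueError("too many symbols for this layout")
    return dict(zip(ranked_symbols, codes))


def encode(text, codebook):
    """Greedy longest-match encoding of text into a flat list of nibbles."""
    longest = max(len(symbol) for symbol in codebook)
    nibbles, i = [], 0
    while i < len(text):
        # Try the longest dictionary entry first, then fall back to one character.
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in codebook:
                nibbles.extend(codebook[piece])
                i += size
                break
        else:
            raise KeyError(f"no code for {text[i]!r}")
    return nibbles


def decode(nibbles, codebook):
    """Invert encode(); the first nibble of each code fixes its length."""
    inverse = {code: symbol for symbol, code in codebook.items()}
    out, i = [], 0
    while i < len(nibbles):
        first = nibbles[i]
        size = 1 if first < 13 else 2 if first < 15 else 3
        out.append(inverse[tuple(nibbles[i:i + size])])
        i += size
    return "".join(out)


# Toy ranking mixing single characters and two dictionary words (hypothetical).
ranked = [" ", "e", "t", "the ", "a", "o", "n", "and ", "h", "c", "s", "i", "r",
          "m", "d", "q"]
book = build_codebook(ranked)
sample = "the cat and the hat"
coded = encode(sample, book)
assert decode(coded, book) == sample
print(f"{4 * len(coded) / len(sample):.2f} bits per character")
```

The rate printed for this toy sample is not comparable to the 3.87 bits per character reported above, which depends on the real frequency ranking and the 205 word dictionary.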
