Combinatorial Compression and Partitioning of Large Dictionaries

Abstract
A method for compressing large dictionaries is proposed, based on transforming words into lexicographically ordered strings of distinct letters, together with permutation indexes. Algorithms to generate such strings are described. Results of applying the method to the dictionaries of two large databases, in Hebrew and English, are presented. The main contribution is a method of partitioning the dictionary such that the ‘information bearing fraction’ is stored in fast memory, while the bulk is kept in auxiliary memory.
