Classification with finite memory

Abstract
Consider the following situation. A device called a classifier observes a probability law P on l-vectors from an alphabet of size A. Its task is to observe a second probability law Q and decide whether P ≡ Q or whether P and Q are sufficiently different according to some appropriate criterion. If the classifier has an unlimited memory available (so that it can remember P(z) exactly for every z), this is a simple matter. In fact, for most differentness criteria a finite memory of 2(log A)l + o(l) bits suffices (for large l): it is enough to store a finite-precision approximation of P(z) for each of the A^l vectors z. It is shown, in a sense made precise in this paper, that a memory of only about 2Rl bits is required, where R < log A and R is closely related to the entropy of P. Further, it is shown that if, instead of being given P(z) for all z, the classifier is given a training sequence drawn according to P that can be stored using about 2Rl bits, then correct classification is also possible.
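As a rough numerical illustration of the two memory counts above, consider the following minimal sketch. All values in it are hypothetical, and R is approximated by the per-symbol entropy of P, which the abstract describes only as closely related to R:

import math

# Hypothetical i.i.d. source: alphabet of size A, blocks of length l.
A = 4
l = 20
P = [0.7, 0.1, 0.1, 0.1]  # assumed per-symbol probabilities, for illustration

# Naive finite-memory scheme from the abstract: store a finite-precision
# approximation of P(z) for all A^l vectors z, costing about
# 2(log A)l + o(l) bits; only the leading term is computed here.
naive_bits = 2 * math.log2(A) * l

# Entropy-based count from the abstract: about 2Rl bits, with R < log A.
# Here R is taken to be the per-symbol entropy H(P) (an assumption).
R = -sum(p * math.log2(p) for p in P)
entropy_bits = 2 * R * l

print(f"2(log A)l : ~{naive_bits:.0f} bits")   # 80 bits for these values
print(f"2Rl       : ~{entropy_bits:.0f} bits")  # ~54 bits, since R = 1.36 < 2 = log A

For a low-entropy source the gap between 2Rl and 2(log A)l grows linearly in l, which is the saving the abstract refers to.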
