Abstract
Most color indexing techniques proposed in the literature are similar: images are represented by color histograms, and a metric on the color histogram space is used to determine the similarity of images. In this paper we determine the limits of these color indexing techniques. We propose two functions to measure the discrimination power of indexing techniques: the capacity (how many distinguishable histograms can be stored) and the maximal match number (the maximal number of retrieved images). We derive bounds for these functions. These bounds have two practical aspects. First, they help a user to decide whether color histograms effectively index database images from a given domain. Second, they facilitate the choice of a good threshold for the distance below which histograms are considered similar. Our arguments are based on an analysis of the metrical properties of the histogram space and results from coding theory. The results show that over a large range of reasonable parameters the capacity is very large. Thus, the set of parameters for which color indexing works well can be described as the set of parameters for which the maximal match number is below an application-dependent maximum.
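
The retrieval scheme the abstract describes (a histogram per image, a metric on the histogram space, and a distance threshold) can be illustrated with a minimal sketch. It assumes RGB pixels quantized uniformly into a few bins per channel and uses the L1 distance as the metric; the abstract does not name a particular quantization or metric, so both are illustrative choices, and the function names `color_histogram`, `l1_distance`, and `retrieve` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized histogram with bins_per_channel**3 bins.

    Uniform quantization of each channel is an assumption of this sketch.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        counts[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    total = len(pixels)
    return [c / total for c in counts]


def l1_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (one possible metric, assumed here)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


def retrieve(query: Sequence[float],
             database: Dict[str, Sequence[float]],
             threshold: float) -> List[str]:
    """Return the names of all stored histograms closer to the query than threshold."""
    return [name for name, hist in database.items()
            if l1_distance(query, hist) < threshold]


# Tiny synthetic "images" given as pixel lists (made-up data for illustration).
red = [(255, 0, 0)] * 4
mostly_red = [(255, 0, 0)] * 3 + [(0, 0, 255)]
blue = [(0, 0, 255)] * 4

database = {
    "red": color_histogram(red),
    "mostly_red": color_histogram(mostly_red),
    "blue": color_histogram(blue),
}
matches = retrieve(color_histogram(red), database, threshold=0.6)
```

In this sketch, raising `threshold` increases the number of matches a query can return, while lowering it makes fewer histograms count as distinct from one another; this is the trade-off that the maximal match number and the capacity are meant to quantify.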
