Creating efficient codebooks for visual recognition

1 January 2005

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 1, pp. 604-610 (ISSN 1550-5499)
https://doi.org/10.1109/iccv.2005.66

Abstract

Visual codebook based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for texture analysis and scene classification. Codebooks are usually constructed by using a method such as k-means to cluster the descriptor vectors of patches sampled either densely ('textons') or sparsely ('bags of features' based on key-points or salience measures) from a set of training images. This works well for texture analysis in homogeneous images, but the images that arise in natural object recognition tasks have far less uniform statistics. We show that for dense sampling, k-means over-adapts to this, clustering centres almost exclusively around the densest few regions in descriptor space and thus failing to code other informative regions. This gives suboptimal codes that are no better than using randomly selected centres. We describe a scalable acceptance-radius based clusterer that generates better codebooks and study its performance on several image classification tasks. We also show that dense representations outperform equivalent keypoint based ones on these tasks and that SVM or mutual information based feature selection starting from a dense codebook further improves the performance.
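
As a rough illustration of the idea of an acceptance-radius clusterer, the following pure-Python sketch greedily picks the remaining descriptor with the most neighbours inside a fixed radius, makes it a centre, and discards every descriptor that centre covers. This is a simplified stand-in under stated assumptions, not the algorithm from the paper: the function names `radius_codebook` and `quantize`, the parameters `radius` and `max_centres`, and the use of Euclidean distance are all hypothetical choices made for the example. The second function shows the quantization step, turning a set of descriptors into a histogram over codebook centres.

```python
import math


def radius_codebook(descriptors, radius, max_centres):
    """Greedy fixed-radius clustering (illustrative sketch only).

    Repeatedly picks the remaining descriptor with the most neighbours
    within `radius`, makes it a centre, and discards everything it covers.
    The neighbour count is quadratic in the number of descriptors, so this
    is only suitable for small samples.
    """
    remaining = list(descriptors)
    centres = []
    while remaining and len(centres) < max_centres:
        # Count neighbours (including the point itself) within the radius.
        counts = [
            sum(1 for q in remaining if math.dist(p, q) <= radius)
            for p in remaining
        ]
        # Ties are broken in favour of the earliest descriptor.
        best = remaining[counts.index(max(counts))]
        centres.append(best)
        # Drop every descriptor covered by the new centre.
        remaining = [q for q in remaining if math.dist(best, q) > radius]
    return centres


def quantize(descriptors, centres):
    """Histogram of nearest-centre assignments (a bag-of-features code)."""
    hist = [0] * len(centres)
    for d in descriptors:
        nearest = min(range(len(centres)),
                      key=lambda i: math.dist(d, centres[i]))
        hist[nearest] += 1
    return hist
```

Because each accepted centre removes its whole neighbourhood, later centres are forced into sparser parts of descriptor space, so a few dense regions cannot absorb the entire codebook. That is the failure mode the abstract attributes to k-means under dense sampling.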
