Fisher Kernels on Visual Vocabularies for Image Categorization
- 1 June 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636919, p. 1-8
- https://doi.org/10.1109/cvpr.2007.383266
Abstract
Within the field of pattern classification, the Fisher kernel is a powerful framework which combines the strengths of generative and discriminative approaches. The idea is to characterize a signal with a gradient vector derived from a generative probability model and to subsequently feed this representation to a discriminative classifier. We propose to apply this framework to image categorization where the input signals are images and where the underlying generative model is a visual vocabulary: a Gaussian mixture model which approximates the distribution of low-level features in images. We show that Fisher kernels can actually be understood as an extension of the popular bag-of-visterms. Our approach demonstrates excellent performance on two challenging databases: an in-house database of 19 object/scene categories and the recently released VOC 2006 database. It is also very practical: it has low computational needs both at training and test time and vocabularies trained on one set of categories can be applied to another set without any significant loss in performance.
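The gradient-vector idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a Gaussian mixture model with diagonal covariances whose parameters (`weights`, `means`, `variances`) have already been fitted, takes gradients only with respect to the means and standard deviations, divides by the number of descriptors, and omits the Fisher information normalization. The function names `gmm_posteriors` and `fisher_gradient` are hypothetical. The per-component soft counts are returned alongside the gradient to show the link to a soft-assignment bag-of-visterms histogram.

```python
import numpy as np

def gmm_posteriors(X, weights, means, variances):
    """Soft-assign each descriptor to each Gaussian (diagonal covariances).

    X: (T, D) descriptors; weights: (K,); means, variances: (K, D).
    Returns a (T, K) matrix of posterior probabilities.
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    # log N(x_t | mu_k, diag(var_k)) for every descriptor/component pair
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_prob                        # (T, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def fisher_gradient(X, weights, means, variances):
    """Gradient of the GMM log-likelihood w.r.t. means and standard deviations.

    Returns (soft_counts, gradient), where gradient has length 2 * K * D.
    Dividing by T is an assumption made here so that images with different
    numbers of descriptors yield comparable vectors.
    """
    T = X.shape[0]
    gamma = gmm_posteriors(X, weights, means, variances)          # (T, K)
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    sigma = np.sqrt(variances)                                    # (K, D)
    # d/d mu_k    = sum_t gamma_t(k) * (x_t - mu_k) / sigma_k^2
    grad_mu = np.einsum('tk,tkd->kd', gamma, diff / variances)
    # d/d sigma_k = sum_t gamma_t(k) * ((x_t - mu_k)^2 / sigma_k^3 - 1 / sigma_k)
    grad_sigma = np.einsum('tk,tkd->kd', gamma,
                           diff ** 2 / sigma ** 3 - 1.0 / sigma)
    soft_counts = gamma.sum(axis=0)                               # soft histogram
    return soft_counts, np.concatenate([grad_mu.ravel(), grad_sigma.ravel()]) / T
```

In this sketch, `soft_counts` plays the role of a soft bag-of-visterms histogram, while the gradient adds first- and second-order information per Gaussian. The resulting vector would then be passed to a discriminative classifier of the reader's choice.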