Joint learning of visual attributes, object classes and visual saliency
- 1 September 2009
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 15505499, p. 537-544
- https://doi.org/10.1109/iccv.2009.5459194
Abstract
We present a method to learn visual attributes (e.g., "red", "metal", "spotted") and object classes (e.g., "car", "dress", "umbrella") together. We assume images are labeled with the category, but not the location, of an instance. We estimate models with an iterative procedure: the current model is used to produce a saliency score, which, together with a homogeneity cue, identifies likely locations for the object (resp. attribute); those locations are then used to produce better models with multiple instance learning. Crucially, the object and attribute models must agree on the potential locations of an object. This means that the more accurate of the two models can guide the improvement of the less accurate model. Our method is evaluated on two data sets of images of real scenes, one in which the attribute is color and the other in which it is material. We show that our joint learning produces improved detectors. We demonstrate generalization by detecting attribute-object pairs which do not appear in our training data. The iteration gives significant improvement in performance.
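The iterative procedure in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `joint_mil` and `fit_linear` are hypothetical, a difference-of-class-means scorer stands in for whatever classifier the paper uses, the homogeneity cue is omitted, and "agreement" between the two models is approximated by summing their scores so that both pick the same candidate region in each image.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_vec(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_linear(pos, neg):
    """Toy stand-in for a real classifier: difference of class means."""
    return [p - q for p, q in zip(mean_vec(pos), mean_vec(neg))]

def joint_mil(bags_obj, bags_attr, neg_obj, neg_attr, n_iter=5):
    """Alternate between picking one region per positive image and refitting.

    bags_obj[i][r] / bags_attr[i][r]: object / attribute feature vector of
    candidate region r in positive image i (location is unlabeled).
    neg_obj / neg_attr: feature vectors of regions from negative images.
    """
    # Initialise each model from all regions of the positive images.
    w_obj = fit_linear([f for bag in bags_obj for f in bag], neg_obj)
    w_attr = fit_linear([f for bag in bags_attr for f in bag], neg_attr)
    chosen = []
    for _ in range(n_iter):
        chosen = []
        for bo, ba in zip(bags_obj, bags_attr):
            # Assumption: a summed score forces both models onto one location.
            scores = [dot(w_obj, fo) + dot(w_attr, fa) for fo, fa in zip(bo, ba)]
            chosen.append(max(range(len(scores)), key=scores.__getitem__))
        # Refit both models on the regions selected in this round.
        w_obj = fit_linear([bag[c] for bag, c in zip(bags_obj, chosen)], neg_obj)
        w_attr = fit_linear([bag[c] for bag, c in zip(bags_attr, chosen)], neg_attr)
    return w_obj, w_attr, chosen
```

Because the selection uses the combined score, a confident model can pull the shared location toward the right region even when the other model is still weak, which is the intuition the abstract gives for joint learning.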