OBJCUT: Efficient Segmentation Using Top-Down and Bottom-Up Cues

23 January 2009

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 32 (3), 530-545
https://doi.org/10.1109/tpami.2009.16

Abstract

We present a probabilistic method for segmenting instances of a particular object category within an image. Our approach overcomes the deficiencies of previous segmentation techniques based on traditional grid conditional random fields (CRFs), namely that 1) they require the user to provide seed pixels for the foreground and the background and 2) they provide a poor prior for specific shapes due to the small neighborhood size of grid CRFs. Specifically, we automatically obtain the pose of the object in a given image instead of relying on manual interaction. Furthermore, we employ a probabilistic model which includes shape potentials for the object to incorporate top-down information that is global across the image, in addition to the grid clique potentials which provide the bottom-up information used in previous approaches. The shape potentials are provided by the pose of the object obtained using an object category model. We represent articulated object categories using a novel layered pictorial structures model. Nonarticulated object categories are modeled using a set of exemplars. These object category models have the advantage that they can handle large intraclass shape, appearance, and spatial variation. We develop an efficient method, OBJCUT, to obtain segmentations using our probabilistic framework. Novel aspects of this method include: 1) efficient algorithms for sampling the object category models of our choice and 2) the observation that a sampling-based approximation of the expected log-likelihood of the model can be increased by a single graph cut. Results are presented on several articulated (e.g., animals) and nonarticulated (e.g., fruits) object categories. We provide a favorable comparison of our method with the state of the art in object category specific image segmentation, specifically the methods of Leibe and Schiele, and of Schoenemann and Cremers.
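
The single-graph-cut observation in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that each sampled pose contributes a binary shape mask with a weight, that the shape cost is linear in the weighted average of those masks, and that the pairwise term is a plain Potts penalty on a 4-connected grid. The names `min_cut_labels`, `segment_with_pose_samples`, `shape_cost`, and `smoothness` are hypothetical. Because a weighted sum of such energies is again an energy of the same form, one s-t min cut minimizes it.

```python
from collections import deque


def min_cut_labels(cost_fg, cost_bg, smoothness):
    """Minimise sum_i U_i(x_i) + smoothness * (number of 4-neighbour label changes)
    over binary labels with one s-t min cut (Edmonds-Karp max-flow).
    All costs are assumed non-negative. Returns 1 for foreground, 0 for background."""
    h, w = len(cost_fg), len(cost_fg[0])
    s, t = h * w, h * w + 1
    cap = [dict() for _ in range(h * w + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse residual edge

    for r in range(h):
        for c in range(w):
            i = r * w + c
            add_edge(s, i, cost_bg[r][c])  # cut when pixel i ends up background
            add_edge(i, t, cost_fg[r][c])  # cut when pixel i ends up foreground
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < h and cc < w:
                    j = rr * w + cc
                    add_edge(i, j, smoothness)
                    add_edge(j, i, smoothness)

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, residual in cap[u].items():
                if residual > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: `parent` now holds the source (foreground) side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push

    return [[1 if r * w + c in parent else 0 for c in range(w)] for r in range(h)]


def segment_with_pose_samples(app_fg, app_bg, masks, weights, shape_cost, smoothness):
    """Add bottom-up appearance costs to top-down shape costs averaged over
    weighted pose samples (an assumed, simplified shape potential), then
    solve the combined energy with a single graph cut."""
    h, w = len(app_fg), len(app_fg[0])
    total = float(sum(weights))
    cost_fg = [[0.0] * w for _ in range(h)]
    cost_bg = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Weighted fraction of pose samples that mark this pixel as object.
            p = sum(wt * m[r][c] for m, wt in zip(masks, weights)) / total
            cost_fg[r][c] = app_fg[r][c] + shape_cost * (1.0 - p)
            cost_bg[r][c] = app_bg[r][c] + shape_cost * p
    return min_cut_labels(cost_fg, cost_bg, smoothness)


# Toy data: two sampled poses of the same 2x2 object, shifted by one column.
pose_a = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pose_b = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
flat = [[0.0] * 4 for _ in range(3)]  # uninformative appearance costs

labels = segment_with_pose_samples(
    flat, flat, [pose_a, pose_b], [3.0, 1.0], shape_cost=4.0, smoothness=1.0
)
for row in labels:
    print(row)
```

With uninformative appearance costs, the segmentation in this toy example follows the more heavily weighted pose sample. Replacing `flat` with per-pixel appearance costs would let the bottom-up cue override the shape prior where the two disagree.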
