SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition
- 1 May 1997
- journal article
- Published by MIT Press in Neural Computation
- Vol. 9 (4), 777-804
- https://doi.org/10.1162/neco.1997.9.4.777
Abstract
Severe architectural and timing constraints within the primate visual system support the conjecture that the early phase of object recognition in the brain is based on a feedforward feature-extraction hierarchy. To assess the plausibility of this conjecture in an engineering context, a difficult three-dimensional object recognition domain was developed to challenge a pure feedforward, receptive-field based recognition model called SEEMORE. SEEMORE is based on 102 viewpoint-invariant nonlinear filters that as a group are sensitive to contour, texture, and color cues. The visual domain consists of 100 real objects of many different types, including rigid (shovel), nonrigid (telephone cord), and statistical (maple leaf cluster) objects and photographs of complex scenes. Objects were individually presented in color video images under normal room lighting conditions. Based on 12 to 36 training views, SEEMORE was required to recognize unnormalized test views of objects that could vary in position, orientation in the image plane and in depth, and scale (factor of 2); for nonrigid objects, recognition was also tested under gross shape deformations. Correct classification performance on a test set consisting of 600 novel object views was 97 percent (chance was 1 percent) and was comparable for the subset of 15 nonrigid objects. Performance was also measured under a variety of image degradation conditions, including partial occlusion, limited clutter, color shift, and additive noise. Generalization behavior and classification errors illustrate the emergence of several striking natural shape categories that are not explicitly encoded in the dimensions of the feature space. It is concluded that in the light of the vast hardware resources available in the ventral stream of the primate visual system relative to those exercised here, the appealingly simple feature-space conjecture remains worthy of serious consideration as a neurobiological model.
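The feature-space idea in the abstract can be illustrated with a toy sketch. This is not SEEMORE itself. The abstract does not specify the 102 filters or the classifier, so the coarse RGB histogram and the nearest-neighbor rule below are assumptions, and the names `color_histogram`, `classify`, and the toy pixel data are hypothetical. The sketch only shows how a position-insensitive histogram feature lets a stored training view match a rearranged test view, and where the 1 percent chance level comes from.

```python
from collections import Counter
from math import sqrt


def color_histogram(pixels, bins=2):
    """Normalized coarse RGB histogram of a list of (r, g, b) pixels.

    Pixel positions are ignored, so the feature does not change when the
    same pixels appear elsewhere in the image (assumed stand-in for one
    family of feature channels, not the paper's actual filters).
    """
    counts = Counter(
        (r * bins // 256, g * bins // 256, b * bins // 256)
        for r, g, b in pixels
    )
    total = len(pixels)
    return [
        counts[(i, j, k)] / total
        for i in range(bins)
        for j in range(bins)
        for k in range(bins)
    ]


def distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(test_vec, training_views):
    """Return the label of the nearest stored training view.

    training_views is a list of (label, feature_vector) pairs.
    Nearest-neighbor matching is an assumption made for this sketch.
    """
    label, _ = min(training_views, key=lambda lv: distance(test_vec, lv[1]))
    return label


# Toy "views": mostly-red and mostly-green objects on a dark background.
red_view = [(250, 10, 10)] * 8 + [(20, 20, 20)] * 2
green_view = [(10, 240, 10)] * 7 + [(20, 20, 20)] * 3
training = [
    ("red_object", color_histogram(red_view)),
    ("green_object", color_histogram(green_view)),
]

# Reordering the pixels (a crude stand-in for moving the object in the
# image) leaves the histogram unchanged, so the test view still matches.
test_view = list(reversed(red_view))
predicted = classify(color_histogram(test_view), training)

# With 100 equally likely objects, random guessing is right 1 time in 100.
chance_accuracy = 1 / 100
```

In this toy setting the feature vector has 8 dimensions; the model described in the abstract uses 102 channels spanning contour, texture, and color cues.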
This publication has 22 references indexed in Scilit:
- Size and position invariance of neuronal responses in monkey inferotemporal cortex. Journal of Neurophysiology, 1995
- Neural ensemble coding in inferior temporal cortex. Journal of Neurophysiology, 1994
- View-dependent object recognition by monkeys. Current Biology, 1994
- Multidimensional indexing for recognizing visual shapes. Published by Institute of Electrical and Electronics Engineers (IEEE), 1994
- Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology, 1994
- Distortion invariant object recognition in the dynamic link architecture. IEEE Transactions on Computers, 1993
- Orientation dependence in the recognition of familiar and novel views of three-dimensional objects. Vision Research, 1992
- Surface versus edge-based determinants of visual recognition. Cognitive Psychology, 1988
- Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, 1983
- Visual properties of neurons in inferotemporal cortex of the Macaque. Journal of Neurophysiology, 1972