Visual Recognition and Inference Using Dynamic Overcomplete Sparse Learning
- 1 September 2007
- journal article
- Published by MIT Press in Neural Computation
- Vol. 19 (9), 2301-2352
- https://doi.org/10.1162/neco.2007.19.9.2301
Abstract
We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
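The efficiency idea in the abstract, updating only a small subset of a large weight matrix when layer activity is sparse, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain outer-product gradient step, and the function name `sparse_outer_update`, the layer sizes, and the learning rate are hypothetical choices made only for illustration.

```python
import numpy as np

def sparse_outer_update(W, delta, x, lr=0.1):
    """In-place W -= lr * outer(delta, x), touching only active rows/columns.

    Assumption (not from the paper): the weight gradient is the outer
    product of an error signal `delta` and an input activity vector `x`.
    """
    rows = np.flatnonzero(delta)  # output units with a nonzero error signal
    cols = np.flatnonzero(x)      # input units with nonzero activity
    if rows.size and cols.size:
        # Only the |rows| x |cols| sub-block of W can change.
        W[np.ix_(rows, cols)] -= lr * np.outer(delta[rows], x[cols])
    return rows.size * cols.size  # number of weights actually touched

rng = np.random.default_rng(0)
n_out, n_in = 200, 300            # hypothetical layer sizes
W = rng.standard_normal((n_out, n_in))

# Sparse activity: 10 of 300 inputs and 8 of 200 error terms are nonzero.
x = np.zeros(n_in)
x[rng.choice(n_in, 10, replace=False)] = rng.standard_normal(10)
delta = np.zeros(n_out)
delta[rng.choice(n_out, 8, replace=False)] = rng.standard_normal(8)

W_dense = W - 0.1 * np.outer(delta, x)   # reference: full dense update
touched = sparse_outer_update(W, delta, x, lr=0.1)
```

In this sketch the sparse update gives the same result as the dense one while modifying only the sub-block of weights whose input and error terms are both nonzero, rather than all 60,000 entries of `W`.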
This publication has 49 references indexed in Scilit:
- A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 2006
- Restoring partly occluded patterns: a neural network model. Neural Networks, 2005
- Sparse coding with an overcomplete basis set: A strategy employed by V1? Published by Elsevier, 2003
- Hierarchical Bayesian inference in the visual cortex. Journal of the Optical Society of America A, 2003
- Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position. Published by Elsevier, 2003
- Generative models for discovering sparse distributed representations. Philosophical Transactions of the Royal Society B: Biological Sciences, 1997
- Borders of Multiple Visual Areas in Humans Revealed by Functional Magnetic Resonance Imaging. Science, 1995
- Learning Invariance from Transformation Sequences. Neural Computation, 1991
- An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories. Neural Computation, 1990
- On the Distinction Between the Conditional Probability and the Joint Probability Approaches in the Specification of Nearest-Neighbour Systems. Biometrika, 1964