A self-similar stack model for human and machine vision

Abstract
A new model is proposed that not only exhibits the major properties of primate spatial vision but also has a structure that can be implemented efficiently in a machine vision system. The model is based on a self-similar stack structure whose spatial resolution varies with eccentricity. It correctly reproduces the visual cortical mapping function, yet it has the important attribute that it produces invariant responses to local changes in the size and position of image features. A novel purpose proposed for cortical “bar-detectors” also allows the model to achieve invariance to more general distortions. The structure of the model permits efficient hierarchical search and naturally embraces the concept of an “attention area”. Exploitation of the model has already confirmed these properties and has also revealed its robust ability to control the focus and gain of machine vision systems.
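The abstract does not spell out the implementation, but one common reading of a “self-similar stack” with eccentricity-dependent resolution is a set of fixed-size layers, each covering a field a constant factor wider than the one below it. The sketch below illustrates that reading only; the function name `self_similar_stack` and all parameters are hypothetical, and nearest-neighbour sampling is an arbitrary simplification, not the paper's method.

```python
import numpy as np

def self_similar_stack(image, centre, levels=4, base_size=32, scale=2.0):
    """Sample `image` into a stack of `levels` layers around `centre`.

    Every layer holds the same number of samples (base_size x base_size),
    but each successive layer covers a field `scale` times wider, so the
    effective spatial resolution falls off with eccentricity from `centre`.
    """
    cy, cx = centre
    offsets = np.arange(base_size) - base_size // 2
    stack = []
    for level in range(levels):
        step = scale ** level  # sampling pitch grows geometrically per layer
        ys = np.clip((cy + step * offsets).astype(int), 0, image.shape[0] - 1)
        xs = np.clip((cx + step * offsets).astype(int), 0, image.shape[1] - 1)
        stack.append(image[np.ix_(ys, xs)])  # base_size x base_size layer
    return stack

# Usage: each layer has identical pixel count but doubled field of view.
img = np.random.rand(512, 512)
for i, layer in enumerate(self_similar_stack(img, centre=(256, 256))):
    print(f"layer {i}: shape {layer.shape}, field width {32 * 2**i} px")
```

Under this reading, scaling a feature by the stack's scale factor simply moves its representation up or down one layer without changing its appearance there, which is one way the size-invariance property described in the abstract could arise.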