Learning early-vision computations

Abstract
In recent studies [D. Marr, Vision (Freeman, San Francisco, Calif., 1982); B. Horn, Robot Vision (MIT Press, Cambridge, Mass., 1986)] algorithms were developed for the solution of several early-vision problems, such as edge detection; interpolation; surface reconstruction; stereo vision; optical-flow computation; and determination of shape from shading, texture, patterns, and motion. These algorithms share the following characteristics: (a) They are based on models that describe the relationship between the desired variables and the image measurements. These models contain parameters that are usually determined in an ad hoc way or through experimentation. In other words, the modeling process must make assumptions that both restrict the applicability of the model and introduce parameters whose values must be determined before the algorithm can be used; to our knowledge, no satisfactory systematic way of computing these parameters has been described. (b) They do not improve with experience; that is, they lack the machinery needed to improve themselves automatically (learn) from examples of previous determinations. (c) They usually fail when the quantity to be computed is a discontinuous function. We present a unified theory for the solution of the above early-vision problems. The underlying mathematical theory is regularization, augmented to account for discontinuities [Tech. Rep. CAR-TR-356 (University of Maryland, College Park, Md., 1988)]. In addition, the parameters of the model are learned in an optimal way through adaptive estimation. Finally, we show empirical results from applying the theory to the one-dimensional surface interpolation problem. This study is motivated by the results of Poggio et al. [Nature (London) 317, 314 (1985)].
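
To make the idea of regularization with discontinuities concrete, the following is a minimal sketch of one-dimensional interpolation from sparse samples. It is not the method of the paper: it assumes a first-order (membrane) smoothness term, a hand-picked regularization weight `lam` rather than one learned by adaptive estimation, and discontinuity locations that are given in advance through the hypothetical `breaks` argument. The function name and all parameters are illustrative assumptions. The sketch minimizes the sum of squared data residuals plus `lam` times the sum of squared neighbor differences, dropping the smoothness term across each break.

```python
def interpolate_1d(n, samples, lam=1.0, breaks=()):
    """Regularized 1-D interpolation on n grid points (illustrative sketch).

    samples: dict mapping grid index -> measured value (sparse data).
    lam:     smoothness weight, chosen by hand here (assumption).
    breaks:  indices i where a discontinuity lies between i and i + 1;
             the smoothness term is switched off across these links.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    breaks = set(breaks)

    # Each continuous segment needs at least one sample, otherwise the
    # minimizer is not unique and the linear system is singular.
    has_data = False
    for i in range(n):
        has_data = has_data or i in samples
        if i == n - 1 or i in breaks:
            if not has_data:
                raise ValueError("segment ending at %d has no sample" % i)
            has_data = False

    # Assemble the tridiagonal normal equations A f = b.
    diag = [0.0] * n
    off = [0.0] * (n - 1)  # off[i] couples points i and i + 1
    rhs = [0.0] * n
    for i, d in samples.items():
        diag[i] += 1.0
        rhs[i] += d
    for i in range(n - 1):
        w = 0.0 if i in breaks else lam
        diag[i] += w
        diag[i + 1] += w
        off[i] = -w

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    f = [0.0] * n
    f[n - 1] = rhs[n - 1] / diag[n - 1]
    for i in range(n - 2, -1, -1):
        f[i] = (rhs[i] - off[i] * f[i + 1]) / diag[i]
    return f


# A step edge sampled at four points, with and without a known break.
data = {0: 0.0, 3: 0.0, 4: 1.0, 7: 1.0}
with_break = interpolate_1d(8, data, lam=1.0, breaks={3})
without_break = interpolate_1d(8, data, lam=1.0)
```

With the break between points 3 and 4, the reconstruction keeps the step sharp; without it, the smoothness term pulls the values on either side of the step toward each other, which illustrates characteristic (c) above.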
