Stratification of three-dimensional vision: projective, affine, and metric representations
- 1 March 1995
- journal article
- Published by Optica Publishing Group in Journal of the Optical Society of America A
- Vol. 12 (3) , 465-484
- https://doi.org/10.1364/josaa.12.000465
Abstract
A conceptual framework is provided in which to think of the relationships between the three-dimensional structure of physical space and the geometric properties of a set of cameras that provide pictures from which measurements can be made. We usually think of physical space as being embedded in a three-dimensional Euclidean space, in which measurements of lengths and angles do make sense. It turns out that for artificial systems, such as robots, this is not a mandatory viewpoint and that it is sometimes sufficient to think of physical space as being embedded in an affine or even a projective space. The question then arises of how to relate these models to image measurements and to geometric properties of sets of cameras. It is shown that, in the case of two cameras, a stereo rig, the projective structure of the world can be recovered as soon as the epipolar geometry of the stereo rig is known and that this geometry is summarized by a single 3 × 3 matrix, which is called the fundamental matrix. The affine structure can then be recovered if to this information is added a projective transformation between the two images that is induced by the plane at infinity. Finally, the Euclidean structure (up to a similitude) can be recovered if to these two elements is added the knowledge of two conics (one for each camera) that are the images of the absolute conic, a circle of radius √−1 in the plane at infinity. In all three cases it is shown how the three-dimensional information can be recovered directly from the images without explicit reconstruction of the scene structure. This defines a natural hierarchy of geometric structures, a set of three strata that is overlaid upon the physical world and that is shown to be recoverable by simple procedures that rely on two items, the physical space itself together with possibly, but not necessarily, some a priori information about it, and some voluntary motions of the set of cameras.
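The three strata named in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's procedure: it assumes a synthetic pinhole stereo rig with known intrinsics and motion (the matrices `K1`, `K2`, `R`, `t`, the test points, and all function names are hypothetical), and builds the three objects from that ground truth using standard textbook formulas. It then checks that the fundamental matrix satisfies the epipolar constraint, that it factors through the homography of the plane at infinity, and that the image of the absolute conic yields an angle between viewing rays directly from image points.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]], dtype=float)

def rotation_y(angle):
    """Rotation about the y axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def project(P, X):
    """Project a 3D point X with the 3x4 camera P; returns homogeneous pixel coords."""
    x = P @ np.append(X, 1.0)
    return x / x[2]

def angle_between_rays(omega, xa, xb):
    """Angle between the viewing rays of image points xa, xb, given the
    image of the absolute conic omega of that camera."""
    num = xa @ omega @ xb
    den = np.sqrt((xa @ omega @ xa) * (xb @ omega @ xb))
    return np.arccos(np.clip(num / den, -1.0, 1.0))

# Hypothetical synthetic rig (assumed values, chosen only for illustration).
K1 = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
K2 = np.array([[820.0, 0, 310], [0, 820, 250], [0, 0, 1]])
R = rotation_y(0.1)
t = np.array([1.0, 0.0, 0.1])
P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K2 @ np.hstack([R, t[:, None]])

# Projective stratum: the fundamental matrix (defined up to scale).
F = np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

# Affine stratum: homography induced by the plane at infinity, and the
# epipole in image 2; together they reproduce F up to scale.
H_inf = K2 @ R @ np.linalg.inv(K1)
e2 = K2 @ t
F_from_affine = skew(e2) @ H_inf

# Metric stratum: image of the absolute conic in camera 1.
omega1 = np.linalg.inv(K1 @ K1.T)

# Two arbitrary scene points in front of both cameras.
X = np.array([0.3, -0.2, 4.0])
Y = np.array([-0.5, 0.4, 6.0])
x1, x2 = project(P1, X), project(P2, X)
y1 = project(P1, Y)

epipolar_residual = x2 @ F @ x1                      # should vanish
angle_from_image = angle_between_rays(omega1, x1, y1)
angle_in_space = np.arccos(X @ Y / (np.linalg.norm(X) * np.linalg.norm(Y)))

print("epipolar residual:", epipolar_residual)
print("angle from image / in space:", angle_from_image, angle_in_space)
```

In this sketch the calibration is given, so the strata are constructed rather than recovered; the point is only that each added object (first `F`, then `H_inf`, then `omega1`) unlocks a richer class of measurements that can be made on image coordinates alone.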