Abstract
Model-based video communication systems attempt to extract information about the three-dimensional structure of the scene to be transmitted. The relative motion between camera and objects can be a powerful depth cue if the objects are rigid or almost rigid. We present a new algorithm for estimating rigid body motion parameters and scene structure from monocular image sequences. A novel epipolar image transform preserves the relevant information from mean squared displaced frame difference (DFD) surfaces and thus overcomes the inherent limitations of feature correspondence methods. Our algorithm conducts a coarse-to-fine search in a 5-dimensional parameter space. Relative depth values are computed for each measurement window. Experimental results are presented to demonstrate the performance of the new algorithm.
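
To make the term concrete, the following is a minimal sketch of a mean squared DFD surface for one measurement window, evaluated over integer displacements. It is not the epipolar image transform or the search procedure of the paper, which the abstract does not specify. The function name `dfd_surface`, its parameters, and the synthetic test frames are assumptions made for illustration only.

```python
import random

def dfd_surface(prev, curr, top, left, size, radius):
    """Mean squared DFD of one window for every integer displacement
    (dy, dx) with |dy|, |dx| <= radius. Hypothetical helper, not the
    paper's implementation."""
    height, width = len(curr), len(curr[0])
    surface = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = top + dy, left + dx
            # Skip displacements that push the window outside the frame.
            if y0 < 0 or x0 < 0 or y0 + size > height or x0 + size > width:
                continue
            total = 0.0
            for r in range(size):
                for c in range(size):
                    diff = curr[y0 + r][x0 + c] - prev[top + r][left + c]
                    total += diff * diff
            surface[(dy, dx)] = total / (size * size)
    return surface

# Synthetic frames (an assumption for the demo): the second frame is the
# first one moved 2 rows down and 3 columns left, with wraparound.
random.seed(0)
H = W = 32
prev = [[random.random() for _ in range(W)] for _ in range(H)]
curr = [[prev[(y - 2) % H][(x + 3) % W] for x in range(W)] for y in range(H)]

surface = dfd_surface(prev, curr, top=12, left=12, size=8, radius=4)
best = min(surface, key=surface.get)  # displacement with the lowest DFD
```

A feature correspondence method would keep only the minimizing displacement `best` and discard the rest of the surface. The abstract indicates that the proposed transform instead retains the relevant information from the whole surface.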
