Covariance scaled sampling for monocular 3D body tracking

Abstract
We present a method for recovering 3D human body motion from monocular video sequences using robust image matching, joint limits and non-self-intersection constraints, and a new sample-and-refine search strategy guided by rescaled cost-function covariances. Monocular 3D body tracking is challenging: for reliable tracking at least 30 joint parameters need to be estimated, subject to highly nonlinear physical constraints; the problem is chronically ill-conditioned, as about 1/3 of the d.o.f. (the depth-related ones) are almost unobservable in any given monocular image; and matching an imperfect, highly flexible, self-occluding model to cluttered image features is intrinsically hard. To reduce correspondence ambiguities we use a carefully designed robust matching-cost metric that combines robust optical flow, edge energy, and motion boundaries. Even so, the ambiguity, nonlinearity and non-observability make the parameter-space cost surface multi-modal, unpredictable and ill-conditioned, so minimizing it is difficult. We discuss the limitations of CONDENSATION-like samplers, and introduce a novel hybrid search algorithm that combines inflated-covariance-scaled sampling and continuous optimization subject to physical constraints. Experiments on some challenging monocular sequences show that robust cost modelling, joint and self-intersection constraints, and informed sampling are all essential for reliable monocular 3D body tracking.
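The core sample-and-refine idea can be illustrated with a minimal sketch: draw candidate poses from a cost covariance (inverse Hessian) that has been inflated so that samples reach along the poorly observed, depth-like directions, then locally optimize each candidate under joint-limit constraints and keep the best modes. The sketch below is an assumption-laden toy, not the paper's implementation; the cost function, dimensionality, bounds, and inflation factor are all placeholders.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# --- Toy stand-ins (assumptions, not the paper's actual model) --------------
DIM = 6                        # the real problem has ~30+ joint parameters
BOUNDS = [(-1.5, 1.5)] * DIM   # hypothetical joint-limit box constraints

def matching_cost(theta):
    """Stand-in for the robust image-matching cost (optical flow + edge
    energy + motion boundaries in the paper). Here: an ill-conditioned
    quadratic whose last two directions are nearly flat, mimicking the
    almost-unobservable depth d.o.f. of monocular tracking."""
    scales = np.array([10.0, 8.0, 6.0, 4.0, 0.05, 0.02])
    return 0.5 * np.sum(scales * (theta - 0.3) ** 2)

def numerical_hessian(f, theta, eps=1e-4):
    """Finite-difference Hessian of the cost at the current estimate."""
    n = len(theta)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.zeros(n), np.zeros(n)
            ei[i], ej[j] = eps, eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei)
                       - f(theta + ej) + f(theta)) / eps ** 2
    return 0.5 * (H + H.T)

def covariance_scaled_sampling(theta0, n_samples=20, inflation=16.0):
    """Sample-and-refine: draw candidates from an inflated cost covariance,
    then locally optimize each one under the joint-limit bounds."""
    H = numerical_hessian(matching_cost, theta0)
    # Covariance ~ H^{-1} (regularized); inflating it spreads samples along
    # the flat directions where alternative cost minima may hide.
    cov = inflation * np.linalg.inv(H + 1e-6 * np.eye(len(theta0)))
    candidates = rng.multivariate_normal(theta0, cov, size=n_samples)

    refined = []
    for c in candidates:
        res = minimize(matching_cost, np.clip(c, -1.5, 1.5),
                       method="L-BFGS-B", bounds=BOUNDS)
        refined.append((res.fun, res.x))
    refined.sort(key=lambda t: t[0])
    return refined             # candidate modes ordered by cost

if __name__ == "__main__":
    theta_prev = np.zeros(DIM)               # pose estimate from previous frame
    modes = covariance_scaled_sampling(theta_prev)
    print("best cost:", modes[0][0], "pose:", np.round(modes[0][1], 3))
```

In the paper the refinement step also enforces non-self-intersection constraints and the best few modes are propagated to the next frame; the box-bounded quadratic here only stands in for that constrained continuous optimization.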
