Robust tracking and compression for video communication

Abstract
Principal components analysis has been studied by the computer vision community as a source of features for recognition of faces, objects and scenes. The use of the dominant principal components as "holistic" features for recognition has provided new insights into view-invariant and illumination-invariant recognition. Unfortunately, applications in object recognition generally require precise segmentation, and thus prove impractical. Nonetheless, under certain circumstances, principal components are optimal for reconstruction, and thus well suited for coding and compression of images. In such applications, precise tracking, rather than segmentation, is required. Precise, stable tracking of faces makes principal components analysis well suited to video coding for video communications. In this paper we describe experiments with the use of principal components as a technique for coding and compression of video streams of talking heads. We describe a new robust tracking technique for normalizing the position and size of faces. We provide results of preliminary experiments on compression rates and image reconstruction quality using orthogonal basis coding for video communications. We show that a typical video sequence of a talking head can often be coded in fewer than 16 dimensions.
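The idea of coding each frame as a handful of coefficients on an orthogonal basis can be illustrated with a minimal sketch. It is not the paper's implementation: the frames are synthetic vectors standing in for face images that have already been tracked and normalized, the function names (`orthonormal_basis`, `encode`, `decode`) are hypothetical, and Gram-Schmidt orthogonalization is swapped in for a principal components decomposition to keep the example dependency-free. Unlike principal components, the Gram-Schmidt basis is not ordered by variance, and no mean image is subtracted.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(frames, max_dims=16, tol=1e-6):
    """Build an orthonormal basis from frames by Gram-Schmidt.

    A frame contributes a new basis vector only if it is not already
    (numerically) spanned by the vectors collected so far.
    """
    basis = []
    for frame in frames:
        residual = list(frame)
        for b in basis:
            c = dot(residual, b)
            residual = [r - c * x for r, x in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        if norm > tol:
            basis.append([r / norm for r in residual])
        if len(basis) == max_dims:
            break
    return basis


def encode(frame, basis):
    """Code a frame as its coefficients on the basis."""
    return [dot(frame, b) for b in basis]


def decode(coeffs, basis):
    """Reconstruct a frame from its coefficients."""
    out = [0.0] * len(basis[0])
    for c, b in zip(coeffs, basis):
        for i, x in enumerate(b):
            out[i] += c * x
    return out


# Synthetic data (an assumption): 20 "frames" of 64 pixels, each a mix of
# 3 hidden patterns, mimicking a sequence with few degrees of freedom.
rng = random.Random(0)
patterns = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(3)]
frames = []
for _ in range(20):
    w = [rng.gauss(0, 1) for _ in range(3)]
    frames.append(
        [sum(wk * p[i] for wk, p in zip(w, patterns)) for i in range(64)]
    )

basis = orthonormal_basis(frames)
codes = [encode(f, basis) for f in frames]
worst_error = max(
    max(abs(a - b) for a, b in zip(f, decode(c, basis)))
    for f, c in zip(frames, codes)
)
print(len(basis), "coefficients per frame instead of", len(frames[0]), "pixels")
```

In this toy setting the sender would transmit the basis once and then only the short coefficient vector for each frame. Real face sequences are not exactly low-rank, so reconstruction there is approximate and depends on how well the tracker normalizes position and size.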
