Learning flexible sprites in video layers

Abstract
See a PPT file with videos at www.research.microsoft.com/users/jojic/FlexiblesSprites.htm. We propose a technique for automatically learning layers of "flexible sprites", probabilistic 2-dimensional appearance maps and masks of moving, occluding objects. The model explains each input image as a layered composition of flexible sprites. A variational expectation maximization algorithm is used to learn a mixture of sprites from a video sequence. For each input image, probabilistic inference is used to infer the sprite class, translation, mask values and pixel intensities (including obstructed pixels) in each layer. Exact inference is intractable, but we show how a variational inference technique can be used to process 320 × 240 images at 1 frame/second. The only inputs to the learning algorithm are the video sequence, the number of layers and the number of flexible sprites. We give results on several tasks, including summarizing a video sequence with sprites, point-and-click video stabilization, and point-and-click object removal. In addition, this model could be used for very low bitrate video compression.
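The "layered composition" idea can be illustrated with a minimal sketch. It covers only the generative direction (sprites to image), using deterministic appearance values and no noise; it does not implement the variational learning or inference described above. The function names `translate` and `compose`, the list-of-lists image representation, and the back-to-front blending rule are assumptions made for illustration, not details taken from the paper.

```python
def translate(grid, dy, dx, fill=0.0):
    """Shift a 2-D list by (dy, dx); cells shifted in from outside get `fill`."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                out[ty][tx] = grid[y][x]
    return out


def compose(layers):
    """Composite layers from back to front (assumed blending rule).

    Each layer is (appearance, mask, (dy, dx)), with mask values in [0, 1].
    A mask value of 1 means the layer fully occludes what lies behind it.
    """
    h, w = len(layers[0][0]), len(layers[0][0][0])
    image = [[0.0] * w for _ in range(h)]
    for appearance, mask, (dy, dx) in layers:
        a = translate(appearance, dy, dx)
        m = translate(mask, dy, dx)
        for y in range(h):
            for x in range(w):
                image[y][x] = m[y][x] * a[y][x] + (1.0 - m[y][x]) * image[y][x]
    return image


# Hypothetical 2 x 3 example: a uniform background and a one-pixel sprite
# that is shifted one pixel to the right before compositing.
background = [[0.2] * 3 for _ in range(2)]
opaque = [[1.0] * 3 for _ in range(2)]
sprite = [[0.9, 0.0, 0.0], [0.0, 0.0, 0.0]]
sprite_mask = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

frame = compose([(background, opaque, (0, 0)), (sprite, sprite_mask, (0, 1))])
```

In this sketch the sprite pixel occludes the background at its translated position, while the background value at that pixel is still stored in the background layer. That is the property the abstract relies on when it mentions inferring obstructed pixels and removing objects.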
