Identification of story units in audio-visual sequences by joint audio and video processing
- 1 January 1998
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1, 363-367
- https://doi.org/10.1109/icip.1998.723500
Abstract
A novel technique that uses joint audio-visual analysis for scene identification and characterization is proposed. The paper defines four scene types: dialogues, stories, actions, and generic scenes. It then explains how any audio-visual material can be decomposed into a series of scenes obeying this classification by properly analyzing and then combining the underlying audio and visual information. A rule-based procedure is defined for this purpose. Before the rule-based decision can take place, a series of low-level pre-processing tasks is suggested to adequately measure audio and visual correlations. As far as visual information is concerned, it is proposed to measure the similarities between non-consecutive shots using a learning vector quantization approach. An outlook on a possible implementation strategy for the overall scene identification task is suggested and validated through a series of experimental simulations on real audio-visual data.
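The abstract mentions a learning vector quantization (LVQ) approach for measuring similarity between non-consecutive shots, but does not spell out the features or training rule. As a rough illustration only, the following is a minimal LVQ1 sketch in Python: the shot feature vectors, class labels, and prototype initializations are all made up here, not taken from the paper.

```python
def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """LVQ1 rule: pull the nearest prototype toward a sample of the same
    class, push it away from a sample of a different class."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Index of the nearest prototype by squared Euclidean distance.
            i = min(range(len(protos)),
                    key=lambda k: sum((a - b) ** 2 for a, b in zip(protos[k], x)))
            sign = 1.0 if proto_labels[i] == y else -1.0
            protos[i] = [p + sign * lr * (a - p) for p, a in zip(protos[i], x)]
    return protos

def classify(x, protos, proto_labels):
    """Assign x the label of its nearest prototype."""
    i = min(range(len(protos)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(protos[k], x)))
    return proto_labels[i]

# Hypothetical 2-D "shot feature" vectors grouped into two similarity classes.
samples = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.9, 1.1)]
labels = ["A", "A", "B", "B"]
protos = lvq1_train(samples, labels, [(0.2, 0.0), (0.8, 1.0)], ["A", "B"])
```

In a shot-similarity setting one would replace the toy vectors with per-shot descriptors (e.g. color or motion features) and use the trained codebook to decide whether two non-consecutive shots fall in the same visual class; those specifics are assumptions here, not details from the paper.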