Multimedia content analysis-using both audio and visual clues
- 1 November 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Signal Processing Magazine
- Vol. 17 (6), 12-36
- https://doi.org/10.1109/79.888862
Abstract
Multimedia content analysis refers to the computerized understanding of the semantic meanings of a multimedia document, such as a video sequence with an accompanying audio track. In a multimedia document, semantics are embedded in multiple forms that are usually complementary to each other. Therefore, it is necessary to analyze all types of data: image frames, sound tracks, texts that can be extracted from image frames, and spoken words that can be deciphered from the audio track. This usually involves segmenting the document into semantically meaningful units, classifying each unit into a predefined scene type, and indexing and summarizing the document for efficient retrieval and browsing. We review advances in using audio and visual information jointly for accomplishing the above tasks. We describe audio and visual features that can effectively characterize scene content, present selected algorithms for segmentation and classification, and review some testbed systems for video archiving and retrieval. We also describe audio and visual descriptors and description schemes that are being considered by the MPEG-7 standard for multimedia content description.
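To make the segmentation step concrete, the following is a minimal sketch of joint audio-visual boundary detection. It is an illustration only and is not taken from the paper: the function names, the per-frame color histogram and short-time energy features, the equal weighting, and the threshold value are all assumptions chosen for simplicity.

```python
from typing import List, Sequence


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms (range 0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def find_boundaries(
    histograms: Sequence[Sequence[float]],  # one normalized color histogram per frame (assumed feature)
    energies: Sequence[float],              # one short-time audio energy value per frame (assumed feature)
    visual_weight: float = 0.5,             # hypothetical weighting between the two cues
    threshold: float = 0.4,                 # hypothetical decision threshold
) -> List[int]:
    """Return the frame indices at which a new segment is assumed to start."""
    if len(histograms) != len(energies):
        raise ValueError("histograms and energies must have the same length")

    # Scale audio changes by the peak energy so both cues lie roughly in [0, 1].
    peak = max(energies, default=0.0) or 1.0

    boundaries = []
    for i in range(1, len(histograms)):
        visual_change = histogram_distance(histograms[i - 1], histograms[i])
        audio_change = abs(energies[i] - energies[i - 1]) / peak
        score = visual_weight * visual_change + (1.0 - visual_weight) * audio_change
        if score > threshold:
            boundaries.append(i)
    return boundaries


if __name__ == "__main__":
    # Two frames of one scene followed by two frames of another.
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    audio = [0.1, 0.1, 0.9, 0.9]
    print(find_boundaries(frames, audio))  # prints [2]
```

With these assumed defaults, a large change in either cue alone is enough to cross the threshold. A real system would tune the weighting and threshold on labeled data and would typically follow segmentation with a classifier that assigns each unit to a scene type.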