Multimedia sensor fusion for intelligent camera control
- 23 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 655-662
- https://doi.org/10.1109/mfi.1996.572243
Abstract
A multisensor-based control system for an active pan/tilt/zoom camera is presented. Acoustic and visual information from multimedia sensors is used to locate the person currently speaking and to track people moving about in a room. Pixel-level fusion of skin color with an image produced from interaural sound delay provides a simple means of detecting the face of the current speaker. For wider-scale surveillance tasks, moving targets are detected using color image differencing. Target data is fed to a behavior-based fuzzy control system, which uses expert rules to aim the camera. Applications include video-conferencing, security, surveillance, and advances in human-computer interaction. The system has been implemented on a multimedia PC equipped with a wide-angle camera, a Canon VC-C1 pan/tilt/zoom camera, and two microphones.
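The abstract does not say how the interaural sound delay is computed. As a rough illustration only, the sketch below estimates the delay between two microphone signals with plain cross-correlation and converts it to a source bearing under a far-field assumption. The function names, the microphone spacing parameter, and the 343 m/s speed of sound are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay(left, right, sample_rate):
    """Estimate the delay of `right` relative to `left`, in seconds.

    Uses the peak of the plain cross-correlation (an illustrative choice,
    not necessarily the method used in the paper). A positive result means
    the sound reached the left microphone first.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Remove the DC offset so the peak reflects signal shape, not bias.
    left = left - left.mean()
    right = right - right.mean()
    corr = np.correlate(right, left, mode="full")
    # Index 0 of the "full" output corresponds to lag -(len(left) - 1).
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return lag / sample_rate


def delay_to_azimuth(delay, mic_spacing):
    """Convert a delay in seconds to a bearing in degrees (far-field model).

    `mic_spacing` is the distance between the microphones in metres
    (a hypothetical parameter for this sketch).
    """
    ratio = np.clip(SPEED_OF_SOUND * delay / mic_spacing, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rate = 16000
    left = rng.standard_normal(1024)
    # Synthetic right channel: the left channel delayed by 5 samples.
    right = np.concatenate([np.zeros(5), left[:-5]])
    tau = estimate_delay(left, right, rate)
    print(tau, delay_to_azimuth(tau, mic_spacing=0.2))
```

A bearing obtained this way could then be rendered as an image and combined pixel by pixel with a skin-color map, which is the fusion step the abstract describes; that rendering is not shown here because the abstract gives no detail on it.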