An audio-video front-end for multimedia applications

8 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 2 (1062922X), 786-791
https://doi.org/10.1109/icsmc.2000.885945

Abstract

Applications such as video gaming, virtual reality, multimodal user interfaces and videoconferencing require systems that can locate and track persons in a room through a combination of visual and audio cues, enhance the sound that they produce, and perform identification. We describe the development of a particular multimodal sensor fusion system that is portable, runs in real time and achieves these objectives. The system employs novel algorithms for acoustical source location, video-based person tracking and overall system control, which are also described.
