Look who's talking: speaker detection using video and audio correlation
- 7 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3, 1589-1592
- https://doi.org/10.1109/icme.2000.871073
Abstract
The visual motion of the mouth and the corresponding audio data generated when a person speaks are highly correlated. This fact has been exploited for lip/speech-reading and for improving speech recognition. We describe a method of automatically detecting a talking person (both spatially and temporally) using video and audio data from a single microphone. The audio-visual correlation is learned using a time-delayed neural network, which is then used to perform a spatio-temporal search for a speaking person. Applications include videoconferencing, video indexing and improving human-computer interaction (HCI). An example HCI application is provided.
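As a rough illustration of the spatio-temporal search described in the abstract, the sketch below slides a window over time and, within each window, scores every candidate image region against the audio. It is not the paper's method: the paper learns the audio-visual relationship with a time-delayed neural network, whereas this sketch substitutes a plain Pearson correlation as a stand-in scorer. The function names (`pearson`, `detect_speaker`), the per-frame features (audio energy and per-region motion energy), the window length, the threshold and the sample data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def detect_speaker(audio_energy, region_motion, window=5, threshold=0.5):
    """Hypothetical spatio-temporal search.

    audio_energy:  per-frame audio energy (assumed feature).
    region_motion: dict mapping a candidate region name to its per-frame
                   motion energy (assumed feature).
    Returns a list of (window_start, region_or_None) pairs, where the region
    is the one whose motion best matches the audio in that window, or None
    when no region scores above the threshold (nobody judged to be speaking).
    """
    results = []
    n_frames = len(audio_energy)
    for start in range(n_frames - window + 1):            # temporal search
        audio_win = audio_energy[start:start + window]
        best_region, best_score = None, threshold
        for name, motion in region_motion.items():        # spatial search
            # Stand-in scorer; the paper uses a learned TDNN here instead.
            score = pearson(audio_win, motion[start:start + window])
            if score > best_score:
                best_region, best_score = name, score
        results.append((start, best_region))
    return results


# Made-up sample data: "left" moves in step with the audio, "right" is static.
audio = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.7]
regions = {
    "left": [0.2, 1.0, 0.3, 0.9, 0.2, 1.0, 0.3, 0.8],
    "right": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
detections = detect_speaker(audio, regions)
```

With this sample data every window selects the "left" region, since the "right" region shows no motion at all. A learned model such as the paper's time-delayed neural network would replace `pearson` with a scorer trained on real audio-visual examples.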
This publication has 13 references indexed in Scilit:
- Vision-based speaker detection using Bayesian networks. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Neural network lipreading system for improved speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- "Eigenlips" for robust speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- A virtual mirror interface using real-time robust face tracking. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Voice puppetry. Published by Association for Computing Machinery (ACM), 1999
- Neural network-based face detection. Published by Institute of Electrical and Electronics Engineers (IEEE), 1998
- X Vision: A Portable Substrate for Real-Time Vision Applications. Computer Vision and Image Understanding, 1998
- Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing, 1995
- Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1989
- Recurrence Plots of Dynamical Systems. Europhysics Letters, 1987