Lipreading from color video

1 August 1997

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Image Processing

Vol. 6 (8), 1192-1195
https://doi.org/10.1109/83.605417

Abstract

We have designed and implemented a lipreading system that recognizes isolated words using only color video of human lips, without acoustic data. The system uses "snakes" (active contour models) to extract geometric features of the lips, the Karhunen-Loève transform (KLT) to extract principal components in the color eigenspace, and hidden Markov models (HMMs) to recognize the combined visual feature sequences. Using visual information alone, the system achieved 94% accuracy on ten isolated words.
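The recognition step named in the abstract can be illustrated with a minimal sketch of isolated-word classification by HMM likelihood. It is not the authors' implementation: it assumes discrete observation symbols (for example, vector-quantized feature vectors), one small left-to-right model per word, and made-up probabilities. The names `forward_likelihood`, `recognize`, and `MODELS` are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete-observation HMM (forward algorithm)."""
    n_states = len(start)
    # Initialization: probability of starting in each state and emitting obs[0].
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Induction: propagate through the transitions, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)


def recognize(obs, models):
    """Pick the word whose model assigns the highest likelihood to obs."""
    return max(models, key=lambda word: forward_likelihood(obs, *models[word]))


# Hypothetical two-state, left-to-right word models over symbols {0, 1}.
# Each entry is (start probabilities, transition matrix, emission matrix);
# the values are illustrative only.
MODELS = {
    "one": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "two": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
```

For long feature sequences, a practical system would typically rescale `alpha` at each step or work in log space to avoid numerical underflow; this sketch omits that for brevity.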
