Locating and tracking facial speech features
- 1 January 1996
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1 (ISSN 1051-4651), pp. 652-656
- https://doi.org/10.1109/icpr.1996.546105
Abstract
This paper describes a robust method for extracting visual speech information from the shape of the lips, for use in an automatic speechreading (lipreading) system. Lip deformation is modelled by a statistically based deformable contour model that learns typical lip deformations from a training set. The main difficulty in locating and tracking lips is finding dominant image features to represent the lip contours. We describe a statistical profile model that learns dominant image features from a training set. The model captures global intensity variation due to differing illumination and skin reflectance, as well as intensity changes at the inner lip contour caused by mouth opening and the visibility of teeth and tongue. The method is validated for locating and tracking lip movements on a database covering a broad variety of speakers.
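The statistically based deformable contour model described in the abstract learns typical lip deformations from a training set and then constrains candidate contours to plausible shapes. The paper does not give implementation details, but this style of model is commonly realised as a PCA-based point distribution model: the sketch below, with illustrative function names and parameters of my own choosing, shows how such a shape model might be trained and how a candidate contour could be constrained to lie within the learned deformation space.

```python
import numpy as np

def train_shape_model(shapes, var_kept=0.95):
    """Learn a linear shape model from aligned training contours.

    shapes: (n_samples, 2*n_points) array, each row an aligned lip
    contour flattened as (x1, y1, x2, y2, ...).
    Returns the mean shape, the retained eigenvectors (modes of
    variation), and their eigenvalues.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    cov = centered.T @ centered / (len(shapes) - 1)
    # Eigen-decompose the covariance; eigh returns ascending order.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = np.maximum(evals[order], 0.0), evecs[:, order]
    # Keep enough modes to explain var_kept of the total variance.
    k = int(np.searchsorted(np.cumsum(evals) / evals.sum(), var_kept)) + 1
    return mean, evecs[:, :k], evals[:k]

def constrain_shape(shape, mean, modes, evals, n_sigma=3.0):
    """Project a candidate contour into the model and clip each mode's
    coefficient to +/- n_sigma standard deviations, so the result is a
    'typical' deformation seen in training."""
    b = modes.T @ (shape - mean)
    limit = n_sigma * np.sqrt(evals)
    b = np.clip(b, -limit, limit)
    return mean + modes @ b
```

During tracking, a search step would propose point positions (e.g. from the intensity profile model along normals to the contour), and `constrain_shape` would then pull the proposal back onto the learned shape manifold; the `n_sigma` bound is the usual heuristic for rejecting implausible deformations.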