Lip synchronization using linear predictive analysis

7 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 2, 1077-1080
https://doi.org/10.1109/icme.2000.871547

Abstract

Linear predictive (LP) analysis is a widely used technique for speech analysis and encoding. In this paper, we discuss the issues involved in applying it to phoneme extraction and lip synchronization. LP analysis yields a set of reflection coefficients that are closely related to the vocal tract shape. Because the vocal tract shape can be correlated with the phoneme being spoken, LP analysis can be applied directly to phoneme extraction. We train neural networks to classify the reflection coefficients into a set of vowels. In addition, average energy is used to handle vowel-vowel and vowel-consonant transitions, whereas zero-crossing information is used to detect the presence of fricatives. We apply the extracted phoneme information directly to our synthetic 3D face model. The proposed method is fast, easy to implement, and adequate for real-time speech animation. Because the method does not rely on language structure or speech recognition, it is language independent. Moreover, the method is speaker independent. It can be applied to lip synchronization for entertainment applications and avatar animation in virtual environments.
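
The three per-frame features named in the abstract (reflection coefficients, average energy, and zero-crossing information) can be illustrated with a minimal sketch. It is not the authors' implementation. The autocorrelation method with the Levinson-Durbin recursion, the Hamming window, the LP order of 10, the sign convention for the coefficients, and all function names are assumptions made for illustration. The neural-network classification stage is not shown.

```python
import math


def average_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(frame, order=10):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Assumed convention: A(z) = 1 + a1*z^-1 + ... and k_m = a_m at step m.
    Other texts flip the sign of k.
    """
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        return [0.0] * order            # silent frame: nothing to model
    a = [1.0] + [0.0] * order           # predictor polynomial coefficients
    err = r[0]                          # prediction error energy
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        ks.append(k)
        err *= 1.0 - k * k
        if err <= 1e-12 * r[0]:         # signal fully predicted; stop early
            break
    return ks + [0.0] * (order - len(ks))


def frame_features(frame, order=10):
    """Bundle the three features for one frame (hypothetical helper)."""
    n = len(frame)
    # Hamming window before LP analysis (an assumed, common choice).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return {
        "reflection": reflection_coefficients(windowed, order),
        "energy": average_energy(frame),
        "zcr": zero_crossing_rate(frame),
    }


if __name__ == "__main__":
    # Synthetic 30 ms frame at an assumed 8 kHz sampling rate.
    frame = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(240)]
    print(frame_features(frame, order=4))
```

In a pipeline of the kind the abstract describes, the reflection coefficients would be the classifier input. The energy would flag transitions or silence, and a high zero-crossing rate would suggest a fricative. Any thresholds would have to be tuned on real speech.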
