Speech-to-image media conversion based on VQ and neural network
- 1 January 1991
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 2865-2868 vol.4
- https://doi.org/10.1109/icassp.1991.151000
Abstract
Automatic media conversion schemes from speech to a facial image and a construction of a real-time image synthesis system are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with synthesized human face images. A human face image is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized by transformation of the 3-D model. In the motion driving method, based on vector quantization and the neural network, the synthesized head image can appear to speak some given words and phrases naturally, in synchronization with voice signals from a speaker.
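The vector quantization step of the motion driving method named in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the codebook entries, the two-dimensional feature vectors, and the mouth-shape parameters (jaw opening, lip width) are all hypothetical values chosen for illustration, and the neural network variant is not shown. Each speech frame is assigned to its nearest codeword, and the codeword index selects a mouth shape that could drive the 3-D model.

```python
from typing import List, Sequence, Tuple

# Hypothetical codebook of speech feature vectors (one codeword per entry).
# The numbers are invented for illustration and do not come from the paper.
CODEBOOK: List[Tuple[float, float]] = [
    (0.0, 0.0),  # assumed: silence
    (1.0, 0.2),  # assumed: open vowel
    (0.3, 1.0),  # assumed: rounded vowel
]

# Hypothetical mouth-shape parameters per codeword: (jaw_opening, lip_width).
MOUTH_SHAPES: List[Tuple[float, float]] = [
    (0.0, 0.5),
    (0.9, 0.6),
    (0.4, 0.2),
]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frame: Sequence[float]) -> int:
    """Return the index of the codeword nearest to the given speech frame."""
    return min(range(len(CODEBOOK)),
               key=lambda i: squared_distance(frame, CODEBOOK[i]))


def frames_to_mouth_shapes(
    frames: Sequence[Sequence[float]],
) -> List[Tuple[float, float]]:
    """Map each speech frame to the mouth shape of its nearest codeword."""
    return [MOUTH_SHAPES[quantize(frame)] for frame in frames]
```

Calling `frames_to_mouth_shapes` on a sequence of frames yields one parameter pair per frame, so the output stays aligned in time with the input speech. A real system would presumably train the codebook on recorded speech rather than fix it by hand.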
This publication has 5 references indexed in Scilit:
- Model-based analysis synthesis image coding (MBASIC) system for a person's face. Signal Processing: Image Communication, 1989
- The pixel machine: a parallel image computer. ACM SIGGRAPH Computer Graphics, 1989
- Parameterized Models for Facial Animation. IEEE Computer Graphics and Applications, 1982
- Characteristics of the mouth shape in the production of Japanese - Stroboscopic observation. Acoustical Science and Technology, 1982
- An Algorithm for Vector Quantizer Design. IEEE Transactions on Communications, 1980