Personalized face and speech communication over the Internet

13 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 37-44
https://doi.org/10.1109/vr.2001.913768

Abstract

We present a system for personalized face and speech communication over the Internet. The system consists of three parts: cloning of real human faces for use as representative avatars; a Networked Virtual Environment System that handles network and device management; and a speech system that includes a text-to-speech engine and a real-time engine for extracting phonemes from natural speech. Together, these three elements allow real humans, represented by their virtual counterparts, to communicate with each other even when they are geographically remote. All elements use MPEG-4 as a common communication and animation standard and were designed and tested on the Windows operating system (OS). The paper presents the main aim of the work, the methodology, and the resulting communication system.
