Analysis and synthesis of the three-dimensional movements of the head, face, and hand of a speaker using cued speech
- 1 August 2005
- journal article
- research article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 118 (2), 1144-1153
- https://doi.org/10.1121/1.1944587
Abstract
In this paper we present our efforts to characterize the three-dimensional (3-D) movements of the right hand and the face of a French female speaker during the audiovisual production of cued speech. The 3-D trajectories of 50 hand and 63 facial flesh points were analyzed during the production of 238 utterances, carefully designed to cover all possible diphones of the French language. Linear and nonlinear statistical models of the articulations and postures of the hand and the face were developed using separate and joint corpora. Automatic recognition of hand and face postures at targets was performed to verify a posteriori that the key hand movements and postures imposed by cued speech had been properly realized by the subject. The recognition results were further exploited to study the phonetic structure of cued speech, notably the phasing relations between hand gestures and sound production. The hand and face gestural scores are studied with reference to the acoustic segmentation. Finally, a first implementation of a concatenative audiovisual text-to-cued-speech synthesis system is described that exploits this unique and extensive data set on cued speech in action.
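The linear statistical modeling mentioned in the abstract is, at its core, a dimensionality-reduction problem: each recorded frame is a vector of 3-D coordinates (50 hand points or 63 face points), and a linear model such as PCA extracts the main articulatory degrees of freedom. The abstract does not specify the implementation; the following is a minimal Python sketch under that assumption, with synthetic data standing in for the recorded corpus and the point counts taken only from the abstract.

```python
import numpy as np

def fit_linear_model(frames: np.ndarray, n_components: int = 6):
    """Fit a linear (PCA-style) model to flesh-point trajectories.

    frames: array of shape (n_frames, n_points, 3) holding 3-D flesh-point
            coordinates (e.g. 50 hand points or 63 face points per frame).
    Returns the mean posture, the first n_components articulation axes,
    and the per-frame parameters along those axes.
    """
    X = frames.reshape(len(frames), -1)           # flatten to (n_frames, 3 * n_points)
    mean = X.mean(axis=0)                         # average posture
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]                # principal articulation axes
    scores = (X - mean) @ components.T            # per-frame articulatory parameters
    return mean, components, scores

# Synthetic example: 1000 frames of 50 hand points in 3-D.
hand_frames = np.random.rand(1000, 50, 3)
mean, axes, params = fit_linear_model(hand_frames)
print(axes.shape, params.shape)                   # (6, 150) (1000, 6)
```

A model of this kind supports both analysis (the per-frame parameters can be compared with the acoustic segmentation to study hand-sound phasing) and resynthesis (postures can be reconstructed from a small number of parameters for a concatenative synthesizer), though the paper's actual models may differ from this sketch.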