Simulation of talking faces in the human brain improves auditory speech recognition
Open Access
- 6 May 2008
- Research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 105 (18), 6747–6752
- https://doi.org/10.1073/pnas.0710826105
Abstract
Human face-to-face communication is essentially audiovisual. Typically, people talk to us face-to-face, providing concurrent auditory and visual input. Understanding someone is easier when there is visual input, because visual cues like mouth and tongue movements provide complementary information about speech content. Here, we hypothesized that, even in the absence of visual input, the brain optimizes both auditory-only speech and speaker recognition by harvesting speaker-specific predictions and constraints from distinct visual face-processing areas. To test this hypothesis, we performed behavioral and neuroimaging experiments in two groups: subjects with a face recognition deficit (prosopagnosia) and matched controls. The results show that observing a specific person talking for 2 min improves subsequent auditory-only speech and speaker recognition for this person. In both prosopagnosics and controls, behavioral improvement in auditory-only speech recognition was based on an area typically involved in face-movement processing. Improvement in speaker recognition was only present in controls and was based on an area involved in face-identity processing. These findings challenge current unisensory models of speech processing, because they show that, in auditory-only speech, the brain exploits previously encoded audiovisual correlations to optimize communication. We suggest that this optimization is based on speaker-specific audiovisual internal models, which are used to simulate a talking face.
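The internal-model idea can be made concrete with a toy sketch. The Python snippet below is not the authors' model; it is a minimal illustration, assuming that the audiovisual exposure phase yields a Gaussian speaker-specific prior over a single articulatory parameter, which is later fused with a noisy auditory-only observation via a one-dimensional precision-weighted (Kalman-style) update. All variable names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: during audiovisual exposure, the listener learns a
# speaker-specific prior over one articulatory parameter. In later
# auditory-only listening, that prior is combined with the noisy acoustic
# observation (simple Gaussian fusion, a toy stand-in for the paper's
# "internal model" idea).
true_value = 1.2                   # speaker's characteristic parameter
prior_mean, prior_var = 1.2, 0.05  # prior learned from audiovisual exposure
obs_var = 0.4                      # auditory-only observation noise

def fuse(obs, prior_mean, prior_var, obs_var):
    """Precision-weighted fusion of prior and observation (1-D Kalman update)."""
    k = prior_var / (prior_var + obs_var)       # Kalman gain
    mean = prior_mean + k * (obs - prior_mean)  # posterior mean
    var = (1 - k) * prior_var                   # posterior variance
    return mean, var

# Compare estimation error with and without the learned prior.
obs = true_value + rng.normal(0.0, np.sqrt(obs_var), size=10_000)
fused, _ = fuse(obs, prior_mean, prior_var, obs_var)

print("auditory-only RMSE: ", np.sqrt(np.mean((obs - true_value) ** 2)))
print("with learned prior: ", np.sqrt(np.mean((fused - true_value) ** 2)))
```

In this toy setting, the learned speaker-specific prior lowers estimation error relative to the raw auditory observation, mirroring the behavioral benefit described in the abstract; the real mechanism the paper proposes is of course far richer than one-dimensional Gaussian fusion.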