The use of visible speech cues for improving auditory detection of spoken sentences
- 1 September 2000
- journal article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 108 (3) , 1197-1208
- https://doi.org/10.1121/1.1288668
Abstract
Classic accounts of the benefits of speechreading to speech recognition treat auditory and visual channels as independent sources of information that are integrated fairly early in the speech perception process. The primary question addressed in this study was whether visible movements of the speech articulators could be used to improve the detection of speech in noise, thus demonstrating an influence of speechreading on the ability to detect, rather than recognize, speech. In the first experiment, ten normal-hearing subjects detected the presence of three known spoken sentences in noise under three conditions: auditory-only, auditory plus speechreading with a visually matched sentence, and auditory plus speechreading with a visually unmatched sentence. When the speechread sentence matched the target sentence, average detection thresholds improved by about 1.6 dB relative to the auditory-only condition. However, the amount of threshold reduction varied significantly across the three target sentences (from 0.8 to 2.2 dB). There was no difference in detection thresholds between the auditory-only condition and the visually unmatched condition. In a second experiment, the effects of visually matched orthographic stimuli on detection thresholds were examined for the same three target sentences in six subjects who had participated in the earlier experiment. When the orthographic stimuli were presented just prior to each trial, average detection thresholds improved by about 0.5 dB relative to the auditory-only condition. However, unlike the visually matched speechreading condition, the detection improvement due to orthography was not dependent on the target sentence.
Analyses of correlations between the area of mouth opening and acoustic envelopes derived from selected spectral regions of each sentence (corresponding to the wide-band speech and the first, second, and third formant regions) suggested that the threshold reduction may be determined by the degree of auditory-visual temporal coherence, especially between the area of lip opening and the envelope derived from mid- to high-frequency acoustic energy. Taken together, the data (for these sentences at least) suggest that visual cues derived from the dynamic movements of the face during speech production interact with time-aligned auditory cues to enhance sensitivity in auditory detection. The amount of visual influence depends in part on the degree of correlation between acoustic envelopes and visible movement of the articulators.
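The envelope-correlation analysis described above can be sketched as follows: band-pass filter the speech waveform to a spectral region, extract its amplitude envelope, and correlate that envelope with a time-aligned lip-area trace. This is a minimal illustration, not the paper's exact procedure; the filter order, band edges, smoothing window, and the synthetic lip-area signal used in the demonstration are all assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert


def band_envelope(x, fs, lo_hz, hi_hz, smooth_ms=50):
    """Amplitude envelope of one spectral region of a speech signal.

    Band-pass filter the waveform, take the Hilbert-transform magnitude,
    then smooth with a moving average (the 50 ms window is an assumed value).
    """
    b, a = butter(4, [lo_hz / (fs / 2), hi_hz / (fs / 2)], btype="band")
    band = filtfilt(b, a, x)
    env = np.abs(hilbert(band))
    win = max(1, int(fs * smooth_ms / 1000))
    return np.convolve(env, np.ones(win) / win, mode="same")


def av_coherence(lip_area, envelope):
    """Pearson correlation between lip-area and acoustic-envelope series.

    Both series must be sampled at the same rate and time-aligned.
    """
    la = (lip_area - lip_area.mean()) / lip_area.std()
    en = (envelope - envelope.mean()) / envelope.std()
    return float(np.mean(la * en))


if __name__ == "__main__":
    # Demonstration with synthetic data: amplitude-modulated noise whose
    # modulator stands in for a (hypothetical) lip-area trace.
    fs = 8000
    t = np.arange(2 * fs) / fs
    rng = np.random.default_rng(0)
    lip = 0.6 + 0.4 * np.sin(2 * np.pi * 4 * t)   # ~4 Hz syllabic rhythm
    audio = lip * rng.standard_normal(t.size)
    env_f2 = band_envelope(audio, fs, 800, 2500)  # rough second-formant region
    print(av_coherence(lip, env_f2))
```

Because the synthetic audio's intensity is driven directly by the simulated lip trace, the printed correlation is high; for real sentences the paper's finding is that this coherence, particularly in mid- to high-frequency bands, predicts how much speechreading lowers the detection threshold.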