Importance of temporal-envelope cues in consonant recognition

1 March 1999

journal article
research article
Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America

Vol. 105 (3), 1801-1809
https://doi.org/10.1121/1.426718

Abstract

The role of different modulation frequencies in the speech envelope was studied by manipulating vowel–consonant–vowel (VCV) syllables. The envelope of the signal was extracted from the speech and the fine structure was replaced by speech-shaped noise. The temporal envelopes in every critical band of the speech signal were notch filtered in order to assess the relative importance of different modulation frequency regions between 0 and 20 Hz. For this purpose, notch filters around three center frequencies (8, 12, and 16 Hz) with three different notch widths (4, 8, and 12 Hz wide) were used. These stimuli were used in a consonant-recognition task in which ten normal-hearing subjects participated, and their results were analyzed in terms of recognition scores. More qualitative information was obtained with a multidimensional scaling method (INDSCAL) and sequential information analysis (SINFA). Consonant recognition is very robust to the removal of certain modulation frequency areas. Only when a wide notch around 8 Hz is applied does the speech signal become heavily degraded. As expected, the voicing information is lost, while there are different effects on plosiveness and nasality. Even the smallest filtering has a substantial effect on the transfer of the plosiveness feature, while on the other hand, filtering out only the low modulation frequencies has a substantial effect on the transfer of nasality cues.
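
To make the notch-filtering step concrete, the following is a minimal sketch of removing a band of modulation frequencies from a single temporal envelope. It is not the authors' implementation: the abstract does not specify the filter design, so an ideal (brick-wall) notch in the DFT domain is assumed here. The function names (`notch_filter_envelope`, `modulation_amplitude`), the 100 Hz envelope sampling rate, and the toy envelope are all hypothetical. Envelope extraction per critical band and the speech-shaped noise carrier are not shown. A naive DFT keeps the example dependency-free; an FFT library would normally be used instead.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short envelopes)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Inverse of dft()."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def notch_filter_envelope(envelope, fs, center_hz, width_hz):
    """Zero all modulation components within width_hz/2 of center_hz.

    Assumption: an ideal brick-wall notch; the filter used in the study
    is not described in the abstract.
    """
    n = len(envelope)
    lo = center_hz - width_hz / 2
    hi = center_hz + width_hz / 2
    spec = dft(envelope)
    for k in range(n):
        # Bin k and bin n-k hold the positive and negative copies of the
        # same modulation frequency, so both are zeroed together.
        freq = min(k, n - k) * fs / n
        if lo <= freq <= hi:
            spec[k] = 0
    return [v.real for v in idft(spec)]


def modulation_amplitude(envelope, fs, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (must lie on a DFT bin)."""
    n = len(envelope)
    k = round(freq_hz * n / fs)
    c = sum(envelope[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return 2 * abs(c) / n


# Toy envelope (hypothetical): DC level plus 4 Hz and 8 Hz modulations,
# sampled at 100 Hz for 1 s so that DFT bins fall on whole hertz.
fs = 100
env = [
    1.0
    + 0.5 * math.sin(2 * math.pi * 4 * t / fs)
    + 0.3 * math.sin(2 * math.pi * 8 * t / fs)
    for t in range(fs)
]

# An 8 Hz center with a 4 Hz width removes modulations from 6 to 10 Hz.
filtered = notch_filter_envelope(env, fs, center_hz=8, width_hz=4)
```

In this toy case the 8 Hz modulation is removed while the 4 Hz modulation and the mean level pass through unchanged. Widening the notch to 12 Hz around the same center would span 2 to 14 Hz and remove both components, which corresponds to the widest 8 Hz condition described in the abstract.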
