A perceptual model of vowel recognition based on the auditory representation of American English vowels

1 April 1986

journal article
research article
Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America

Vol. 79 (4), 1086-1100
https://doi.org/10.1121/1.393381

Abstract

A quantitative perceptual model of human vowel recognition based upon psychoacoustic and speech perception data is described. At an intermediate auditory stage of processing, the specific bark difference level of the model represents the pattern of peripheral auditory excitation as the distance in critical bands (barks) between neighboring formants and between the fundamental frequency (F0) and first formant (F1). At a higher, phonetic stage of processing, represented by the critical bark difference level of the model, the transformed vowels may be dichotomously classified based on whether the difference between formants in each dimension falls within or exceeds the critical distance of 3 bark for the spectral center of gravity effect [Chistovich et al., Hear. Res. 1, 185-195 (1979)]. Vowel transformations and classifications correspond well to several major phonetic dimensions and features by which vowels are perceived and traditionally classified. The F1-F0 dimension represents vowel height, and high vowels have F1-F0 differences within 3 bark. The F3-F2 dimension corresponds to vowel place of articulation, and front vowels have F3-F2 differences of less than 3 bark. As an inherent, speaker-independent normalization procedure, the model provides excellent vowel clustering while it greatly reduces between-speaker variability. It offers robust normalization through feature classification because gross binary categorization allows for considerable acoustic variability. There was generally less formant and bark difference variability for closely spaced formants than for widely spaced formants. These findings agree with independently observed perceptual results and support Stevens' quantal theory of vowel production and perceptual constraints on production predicted from the critical bark difference level of the model.
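
The two stages described in the abstract (bark differences, then a binary split at the 3-bark critical distance) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the abstract: the Hz-to-bark conversion (the sketch uses the Zwicker and Terhardt, 1980, analytic approximation, since the abstract does not name one), the function names, the strict "less than" treatment of the boundary, and the formant values in the usage lines, which are round illustrative numbers rather than data from the paper.

```python
import math

# Critical distance for the spectral center of gravity effect, as cited in the abstract.
CRITICAL_DISTANCE_BARK = 3.0


def hz_to_bark(freq_hz):
    """Convert Hz to bark (ASSUMPTION: Zwicker & Terhardt 1980 approximation)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def bark_differences(f0, f1, f2, f3):
    """Return the bark distances between F0/F1 and between neighboring formants."""
    z0, z1, z2, z3 = (hz_to_bark(f) for f in (f0, f1, f2, f3))
    return {"F1-F0": z1 - z0, "F2-F1": z2 - z1, "F3-F2": z3 - z2}


def classify(diffs):
    """Binary split per dimension: True if the difference is under 3 bark.

    ASSUMPTION: a difference of exactly 3 bark is treated as exceeding
    the critical distance; the abstract does not specify the boundary case.
    """
    return {dim: d < CRITICAL_DISTANCE_BARK for dim, d in diffs.items()}


def height_and_place(f0, f1, f2, f3):
    """Map the two dimensions named in the abstract to feature labels."""
    within = classify(bark_differences(f0, f1, f2, f3))
    return {"high": within["F1-F0"], "front": within["F3-F2"]}


# Hypothetical formant values in Hz, chosen only for illustration.
front_high_like = height_and_place(f0=130, f1=270, f2=2290, f3=3010)
back_low_like = height_and_place(f0=124, f1=730, f2=1090, f3=2440)
print(front_high_like)
print(back_low_like)
```

Because each dimension is reduced to a yes/no decision, moderate shifts in the formant frequencies leave the labels unchanged unless a difference sits close to the 3-bark boundary, which is the sense in which the abstract calls the normalization robust.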
