Abstract
Perception models based on different kinds of acoustic data were compared with respect to their capacity to predict perceptual confusions between the Swedish stops [b, d, ɖ, g] in systematically varied vowel contexts. Fragments of VC:V utterances read by a male speaker were presented to listeners. The resulting confusions were especially numerous between short stimulus segments following stop release, and formed a regular pattern depending mainly on the acute/grave dimension of the following vowel. The acoustic distances calculated were based on: (1) filter band spectra; (2) F2 and F3 at the CV boundary and in the middle of the following vowel; (3) the duration of the burst (= transient + noise section). Both the spectrum-based and the formant-based models provided measures of acoustic distance (dissimilarity) that revealed regular patterns. However, the predictive capacity of both models was improved by including the time-varying properties of the stimuli in the distance measures. The highest correlation between predicted and observed percent confusions, r = 0.85, was obtained with the formant-based model in combination with burst length data. The asymmetries in the listeners' confusions were also shown to be predictable, given acoustic data on the following vowel.
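
The formant-based distance with burst duration described above might be sketched roughly as follows. This is an illustration only, not the procedure used in the study: the Euclidean form, the `burst_weight` parameter, the function names, and the assumption that all features have already been brought to comparable scales are hypothetical choices made for the example.

```python
import math


def acoustic_distance(a, b, burst_weight=1.0):
    """Euclidean distance between two stimuli (hypothetical formulation).

    Each stimulus is a dict with:
      "formants": [F2 at CV boundary, F3 at CV boundary,
                   F2 at vowel midpoint, F3 at vowel midpoint]
      "burst":    burst duration (transient + noise section)
    All values are assumed to be pre-scaled to comparable units.
    """
    formant_term = sum((x - y) ** 2 for x, y in zip(a["formants"], b["formants"]))
    burst_term = burst_weight * (a["burst"] - b["burst"]) ** 2
    return math.sqrt(formant_term + burst_term)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In such a setup, distances for each stimulus pair would first be mapped to predicted percent confusions (a smaller distance implying more confusions), and `pearson_r` would then compare those predictions with the observed percentages. The mapping itself is not specified in the abstract and is left out here.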
