Abstract
A method is proposed for reducing the ambiguity of vowels in connected speech by normalizing coarticulation effects. Using the first and second formant trajectories as features, the method is applied to vowels in phonetic environments where substantial ambiguity is likely to occur. For these vowel samples, the separability between vowel clusters is found to be greatly improved. The distribution of the vowels on the feature plane obtained by this method appears to reflect how they are perceived when presented to listeners without being isolated from their phonetic environments. The method thus appears useful for automatic speech recognition, and some possible mechanisms underlying the dynamic aspects of human speech recognition are inferred.
