Abstract
A modular connectionist network model for capturing invariant relationships between the acoustic signal and phonetic categories is developed. The model addresses variations in the acoustic manifestation of phonemes as a function of loudness, speaker identity, speaking rate, and phonetic context. These sources of variation are assigned to separate, specialized network modules. Each module transforms the representation of its input signal to normalize the effects of these source variables. Components of the model have been tested on isolated words for speaker-adaptive vowel recognition (97%) and context-dependent vowel recognition (99.7%). A model integrating amplitude normalization, speaker normalization, and context modulation for continuous speech recognition is under development.
