Phoneme discrimination using connectionist networks
- 1 April 1990
- journal article
- research article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 87 (4), 1753-1772
- https://doi.org/10.1121/1.399424
Abstract
The application of connectionist networks to speech recognition is assessed using a set of eight representative phonetic discrimination problems chosen with respect to a theory of phonetics. A connectionist network model called the temporal flow model (TFM) is defined, which represents temporal relationships using delay links and permits general patterns of connectivity. It is argued that the model has properties appropriate for time-varying signals such as speech. Networks are trained using gradient descent methods of iterative nonlinear optimization to reduce the mean-squared error between the actual and the desired response of the output units. Separate network solutions are demonstrated for all eight phonetic discrimination problems for one male speaker. The network solutions are analyzed carefully and are shown in every case to make use of known acoustic phonetic cues. The network solutions vary in the degree to which they make use of context-dependent cues to achieve phoneme recognition. The network solutions were tested on data not used for training and achieved an average accuracy of 99.5%. It is concluded that acoustic phonetic speech recognition can be accomplished using connectionist networks.
This publication has 4 references indexed in Scilit:
- Learning the hidden structure of speech. The Journal of the Acoustical Society of America, 1988
- Neural computation by concentrating information in time. Proceedings of the National Academy of Sciences, 1987
- Neural dynamics of word recognition and recall: Attentional priming, learning, and resonance. Psychological Review, 1986
- A Physiological Theory of Phonetics. Journal of Speech and Hearing Research, 1966