Improving temporal representation in TDNN structure for phoneme recognition

2 January 2003

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 4, 728-733
https://doi.org/10.1109/ijcnn.1992.227232

Abstract

The authors address the problem of increasing the amount of temporal information that a time-delay neural network (TDNN) can extract in speech recognition. In addition to time windows over the input, frequency windows are connected to the hidden units. The frequency windows let the lowest level of the network extract additional information, such as the change in the energy content of the speech data over time. The approach was verified by designing an unvoiced stop consonant classifier and evaluating it on continuous speech. Results are presented that demonstrate the viability of the approach.
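
The difference between a time window and a combined time and frequency window can be illustrated with a minimal sketch. It is not the authors' architecture, which the abstract does not specify. It assumes the input is a list of spectral frames (one list of band energies per frame), that a conventional TDNN hidden unit sees a few consecutive frames across all bands, and that a frequency-windowed unit sees only a few adjacent bands within those frames. The function names, the sigmoid activation, the weight sharing across frequency positions, and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def time_window_unit(frames, weights, bias=0.0):
    """Hidden unit connected to len(weights) consecutive frames, all bands.

    frames:  list of frames, each a list of band energies
    weights: one weight row per delay, one weight per band
    Returns one activation per time position of the window.
    """
    delays = len(weights)
    out = []
    for t in range(len(frames) - delays + 1):
        z = bias
        for d in range(delays):
            for f, w in enumerate(weights[d]):
                z += w * frames[t + d][f]
        out.append(sigmoid(z))
    return out


def time_frequency_window_unit(frames, weights, bias=0.0):
    """Hidden unit connected to a (delays x width) patch of the input.

    The same weights are applied at every time and frequency position
    (an assumption of this sketch), giving one activation per position.
    """
    delays, width = len(weights), len(weights[0])
    n_bands = len(frames[0])
    out = []
    for t in range(len(frames) - delays + 1):
        row = []
        for b in range(n_bands - width + 1):
            z = bias
            for d in range(delays):
                for k in range(width):
                    z += weights[d][k] * frames[t + d][b + k]
            row.append(sigmoid(z))
        out.append(row)
    return out


# Toy input: 5 frames x 4 bands, energy rising by 1 per frame and per band.
frames = [[float(t + f) for f in range(4)] for t in range(5)]

# All-band unit over 2 frames, zero weights (placeholder values).
full_band = time_window_unit(frames, [[0.0] * 4, [0.0] * 4])

# Patch of 2 frames x 2 bands that responds to a rise in energy over time:
# negative weights on the earlier frame, positive on the later one.
delta_weights = [[-1.0, -1.0], [1.0, 1.0]]
local_change = time_frequency_window_unit(frames, delta_weights)
```

Here `full_band` has one value per time position, while `local_change` has one value per time position and per frequency position, so a change in energy can be localized to a group of adjacent bands rather than summed over the whole spectrum.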
