Connected-digit speaker-dependent speech recognition using a neural network with time-delayed connections
- 1 March 1991
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Signal Processing
- Vol. 39 (3), 698-713
- https://doi.org/10.1109/78.80888
Abstract
An analog neural network that can be taught to recognize stimulus sequences is used to recognize the digits in connected speech. The circuit computes in the analog domain, using linear circuits for signal filtering and nonlinear circuits for simple decisions, feature extraction, and noise suppression. An analog perceptron learning rule is used to organize the subset of connections in the circuit that are specific to the chosen vocabulary. Computer simulations of the learning algorithm and circuit demonstrate recognition scores >99% for a single-speaker connected-digit database. There is no clock. The circuit is data driven, and there is no need for endpoint detection or segmentation of the speech signal during recognition. Training in the presence of noise provides noise immunity up to the trained level. For the speech problem studied, the circuit connections need only be accurate to about 3-bit digitization depth for optimum performance. The algorithm used maps efficiently onto analog neural network hardware.
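The abstract's two quantitative ideas, a perceptron learning rule and connections that need only about 3-bit accuracy, can be illustrated with a small sketch. It is not the paper's analog circuit or its learning rule: it uses the textbook discrete perceptron update on made-up 2-D toy data, and the uniform quantizer, the margin, and all function names (`make_toy_data`, `train_perceptron`, `quantize`, `accuracy`) are assumptions chosen only for illustration.

```python
import random

def make_toy_data(rng, n, margin=0.5):
    """Hypothetical toy data: 2-D points labeled by the sign of x0 + x1,
    rejecting points closer than `margin` to the boundary."""
    samples, labels = [], []
    while len(samples) < n:
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        if abs(x[0] + x[1]) > margin:
            samples.append(x)
            labels.append(1 if x[0] + x[1] > 0 else 0)
    return samples, labels

def predict(w, b, x):
    """Threshold unit: 1 if the weighted sum exceeds zero, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(samples, labels, epochs=100, lr=0.1):
    """Textbook perceptron rule: on each error, w += lr * (target - output) * x."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(samples, labels):
            y = predict(w, b, x)
            if y != t:
                errors += 1
                w = [wi + lr * (t - y) * xi for wi, xi in zip(w, x)]
                b += lr * (t - y)
        if errors == 0:  # every training point is classified correctly
            break
    return w, b

def quantize(values, bits=3):
    """Snap each value to the nearest of 2**bits evenly spaced levels
    spanning [-m, m], where m is the largest magnitude in `values`."""
    m = max(abs(v) for v in values)
    if m == 0:
        return list(values)
    step = 2 * m / (2 ** bits - 1)
    return [round((v + m) / step) * step - m for v in values]

def accuracy(w, b, samples, labels):
    """Fraction of samples whose predicted label matches the target."""
    hits = sum(predict(w, b, x) == t for x, t in zip(samples, labels))
    return hits / len(samples)

rng = random.Random(0)
samples, labels = make_toy_data(rng, 200)
w, b = train_perceptron(samples, labels)
qw0, qw1, qb = quantize([w[0], w[1], b], bits=3)
print("full-precision accuracy:", accuracy(w, b, samples, labels))
print("3-bit accuracy:", accuracy([qw0, qw1], qb, samples, labels))
```

Comparing the two printed accuracies shows how much a coarse weight grid costs on this toy problem. Nothing here reproduces the paper's speech results, which come from its own simulations.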