A comparative study on the performance of several speech recognition techniques applied on the highly confusing Mandarin syllables
- 1 September 1989
- journal article
- research article
- Published by Taylor & Francis in Journal of the Chinese Institute of Engineers
- Vol. 12 (6) , 705-713
- https://doi.org/10.1080/02533839.1989.9677213
Abstract
In this paper, the performance of several speech recognition techniques applied to the highly confusing Mandarin syllables was carefully compared, including dynamic time warping (DTW), the newly proposed DTW with superimposed weighting function (DTWW), discrete hidden Markov models (DHMM) and continuous hidden Markov models (CHMM). The vocabulary used here consists of 409 first-tone isolated Mandarin syllables. Because many confusing sets exist in this vocabulary, accurate recognition of these syllables is relatively difficult, and all the recognition experiments were performed in speaker-dependent mode. After a series of 13 experiments, it was found that the recognition rate of the newly proposed DTWW (88.3) is higher than that of DTW (85.1), DHMM (65.0) and CHMM (83.9), and that the CPU time used for DTWW is 1.03 times that for DTW, 24 times that for DHMM and 4.3 times that for CHMM. In addition, the memory space required for DTWW and DTW is 3.4 times that of DHMM and 8.5 times that of CHMM. Therefore, DTWW has the highest recognition rate and DHMM has the fastest recognition speed, whereas CHMM appears to be very attractive when all the different factors, including recognition rate, recognition speed and memory space requirement, are considered.
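The abstract states every cost as a multiple of another method's cost, which makes the trade-off hard to read at a glance. The sketch below re-expresses the quoted ratios against a common baseline. It is only arithmetic on the figures in the abstract: the function name `relative_costs` and the dictionary layout are illustrative assumptions, not anything from the paper. Under those ratios, DTW needs about 23.3 times and CHMM about 5.6 times the CPU time of DHMM, and DHMM needs 2.5 times the memory of CHMM.

```python
# Ratios quoted in the abstract, each meaning "DTWW cost / that method's cost".
cpu_dtww_over = {"DTW": 1.03, "DHMM": 24.0, "CHMM": 4.3}
# The abstract gives one memory figure for DTWW and DTW, so DTW's ratio is 1.0.
mem_dtww_over = {"DTW": 1.0, "DHMM": 3.4, "CHMM": 8.5}
# Recognition rates exactly as quoted (the abstract does not state a unit).
recognition_rate = {"DTWW": 88.3, "DTW": 85.1, "DHMM": 65.0, "CHMM": 83.9}


def relative_costs(dtww_over, baseline):
    """Re-express costs as multiples of `baseline` (hypothetical helper)."""
    # With DTWW's cost taken as 1, every other method costs 1 / ratio.
    costs = {"DTWW": 1.0}
    costs.update({method: 1.0 / ratio for method, ratio in dtww_over.items()})
    base = costs[baseline]
    return {method: cost / base for method, cost in costs.items()}


cpu = relative_costs(cpu_dtww_over, "DHMM")   # CPU time, DHMM = 1
mem = relative_costs(mem_dtww_over, "CHMM")   # memory, CHMM = 1

for method in ("DTWW", "DTW", "CHMM", "DHMM"):
    print(f"{method:5s} rate={recognition_rate[method]:5.1f} "
          f"cpu={cpu[method]:6.2f} mem={mem[method]:5.2f}")
```

Read this way, CHMM gives up 4.4 points of recognition rate relative to DTWW (88.3 versus 83.9) while using less than a quarter of its CPU time and the least memory of the four, which is consistent with the abstract's closing remark.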