A comparative study on the performance of several speech recognition techniques applied on the highly confusing mandarin syllables

1 September 1989

journal article
research article
Published by Taylor & Francis in Journal of the Chinese Institute of Engineers

Vol. 12 (6), 705-713
https://doi.org/10.1080/02533839.1989.9677213

Abstract

This paper compares the performance of several speech recognition techniques applied to highly confusing Mandarin syllables: dynamic time warping (DTW), the newly proposed DTW with a superimposed weighting function (DTWW), discrete hidden Markov models (DHMM), and continuous hidden Markov models (CHMM). The vocabulary consists of 409 first-tone isolated Mandarin syllables. Because this vocabulary contains many confusing sets, accurate recognition of these syllables is relatively difficult; all recognition experiments were performed in speaker-dependent mode. After a series of 13 experiments, the recognition rate of the newly proposed DTWW (88.3) was found to be higher than that of DTW (85.1), DHMM (65.0), and CHMM (83.9), while the CPU time used by DTWW is 1.03 times that of DTW, 24 times that of DHMM, and 4.3 times that of CHMM. In addition, the memory required by DTWW and DTW is 3.4 times that of DHMM and 8.5 times that of CHMM. Thus DTWW has the highest recognition rate and DHMM the fastest recognition speed, whereas CHMM appears very attractive when recognition rate, recognition speed, and memory requirement are all considered together.
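The ratios quoted in the abstract are all stated relative to DTWW, so the trade-off is easier to see once each method is put on a common scale. The following is a minimal sketch that only rearranges the figures given above; the names `recognition_rate`, `cpu_ratio`, `mem_ratio`, and `relative_costs` are hypothetical and are not taken from the paper.

```python
# Figures as quoted in the abstract. DTWW is the reference method (cost 1.0).
recognition_rate = {"DTWW": 88.3, "DTW": 85.1, "DHMM": 65.0, "CHMM": 83.9}

# CPU time of DTWW divided by CPU time of each other method.
cpu_ratio = {"DTW": 1.03, "DHMM": 24.0, "CHMM": 4.3}

# Memory of DTWW (stated to equal that of DTW) divided by memory of each HMM.
mem_ratio = {"DHMM": 3.4, "CHMM": 8.5}


def relative_costs():
    """Return rate, CPU time, and memory per method, with DTWW cost = 1.0."""
    rows = {}
    for method, rate in recognition_rate.items():
        cpu = 1.0 if method == "DTWW" else 1.0 / cpu_ratio[method]
        mem = 1.0 if method in ("DTWW", "DTW") else 1.0 / mem_ratio[method]
        rows[method] = {"rate": rate, "cpu": cpu, "mem": mem}
    return rows


if __name__ == "__main__":
    print(f"{'method':<6} {'rate':>6} {'cpu':>7} {'mem':>7}")
    for method, row in relative_costs().items():
        print(f"{method:<6} {row['rate']:>6.1f} {row['cpu']:>7.3f} {row['mem']:>7.3f}")
```

On this scale CHMM needs roughly a quarter of the CPU time and about an eighth of the memory of DTWW while giving up 4.4 points of recognition rate, which is consistent with the abstract's closing remark.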
