Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition
- 1 January 1997
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 2, 1267-1270
- https://doi.org/10.1109/icassp.1997.596176
Abstract
Dynamic speech features are routinely used in current speech recognition systems in combination with short-term (static) spectral features. Although many existing speech recognition systems do not weight the two kinds of features, some weighting appears useful for increasing recognition accuracy. When weighting is applied, it is either tuned manually or consists simply of compensating for the variances. This paper proposes a method to automatically estimate an optimal state-dependent stream weighting in a continuous density hidden Markov model (CDHMM) recognition system by means of a maximum-likelihood training algorithm. Unlike in other works, it is shown that simple constraints on the new weighting parameters make it possible to apply the maximum-likelihood criterion to this problem. Experimental results in speaker-independent digit recognition show a substantial increase in recognition accuracy.
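The abstract does not give the scoring formula, so the following is only an illustrative sketch of what state-dependent stream weighting commonly looks like in a multi-stream CDHMM: each stream's log-likelihood is multiplied by a per-state weight before the streams are summed. The exponent-weight form, the single diagonal Gaussian per stream, the function names, and the choice of weights that sum to the number of streams are all assumptions made for illustration; they are not taken from the paper, whose constraints and estimation algorithm are not stated here.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def weighted_state_loglik(streams, params, weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state.

    streams: list of feature vectors, e.g. [static, delta]
    params:  list of (mean, var) pairs, one per stream (hypothetical values)
    weights: per-stream weights for this state (assumed exponent form)
    """
    return sum(
        w * diag_gaussian_logpdf(x, mean, var)
        for x, (mean, var), w in zip(streams, params, weights)
    )


# Hypothetical state with a 2-dim static stream and a 2-dim dynamic stream.
static, delta = [0.0, 0.0], [1.0, -1.0]
params = [([0.0, 0.0], [1.0, 1.0]), ([0.0, 0.0], [1.0, 1.0])]

# Equal weights reproduce the usual unweighted score.
unweighted = weighted_state_loglik([static, delta], params, [1.0, 1.0])

# Assumed constraint for this sketch: weights sum to the number of streams.
reweighted = weighted_state_loglik([static, delta], params, [1.5, 0.5])
```

With these toy values the static stream fits the state better than the dynamic stream, so shifting weight toward it raises the state score. Without some constraint on the weights, the score could be changed simply by rescaling all of them, which is why the weights in this sketch are tied to a fixed sum.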