Convolutional density estimation in hidden Markov models for speech recognition
- 1 January 1999
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1 (ISSN 1520-6149), pp. 113-116
- https://doi.org/10.1109/icassp.1999.758075
Abstract
In continuous density hidden Markov models (HMMs) for speech recognition, the probability density function (PDF) for each state is usually expressed as a mixture of Gaussians. We present a model in which the PDF is expressed as the convolution of two densities. We focus on the special case where one of the convolved densities is an M-Gaussian mixture and the other is a mixture of N impulses. We present the reestimation formulae for the parameters of the M×N convolutional model and suggest two ways of initializing them: the residual K-means approach, and deconvolution from a standard HMM with MN Gaussians per state, using a genetic algorithm to search for the optimal assignment of Gaussians. Both methods result in a compact representation that requires only O(M+N) storage space for the model parameters and O(MN) time for training and decoding. We explain how the decoding time can be reduced to O(M+kN), where k<M. Finally, results are shown on the 1996 Hub-4 Development test, demonstrating that a 32×2 convolutional model can achieve performance comparable to that of a standard model with 64 Gaussians per state.
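The key idea can be read directly from the convolution: convolving an M-component Gaussian mixture with an N-component mixture of impulses yields an effective M×N Gaussian mixture whose component means are the Gaussian means shifted by the impulse locations, while only the M Gaussian parameters and N impulse parameters are stored. The sketch below illustrates this expansion for a single state density; it is a minimal illustration assuming diagonal covariances, and the function name, parameter names, and toy values are illustrative rather than taken from the paper.

```python
import numpy as np

def convolutional_density(x, weights_g, means, variances, weights_i, offsets):
    """Evaluate an M x N convolutional state density at observation x.

    The state PDF is the convolution of an M-component diagonal-covariance
    Gaussian mixture with an N-component mixture of impulses. This expands
    to an M*N Gaussian mixture whose means are the Gaussian means shifted
    by the impulse locations, while only O(M+N) parameters are stored.
    """
    density = 0.0
    dim = x.shape[0]
    for c, mu, var in zip(weights_g, means, variances):        # M Gaussian components
        norm = 1.0 / np.sqrt((2.0 * np.pi) ** dim * np.prod(var))
        for d, nu in zip(weights_i, offsets):                  # N impulse components
            diff = x - (mu + nu)                               # mean shifted by impulse
            density += c * d * norm * np.exp(-0.5 * np.sum(diff * diff / var))
    return density

# Toy example: M=2 Gaussians convolved with N=2 impulses (4 effective components)
rng = np.random.default_rng(0)
x = rng.normal(size=3)
p = convolutional_density(
    x,
    weights_g=np.array([0.6, 0.4]),
    means=rng.normal(size=(2, 3)),
    variances=np.ones((2, 3)),
    weights_i=np.array([0.7, 0.3]),
    offsets=0.1 * rng.normal(size=(2, 3)),
)
print(p)
```

The double loop makes the O(MN) evaluation cost explicit, while the stored parameters (weights, means, variances, impulse weights, offsets) grow only as O(M+N), which is the storage saving highlighted in the abstract.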