Buried Markov models for speech recognition
- 1 January 1999
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 2 (ISSN 1520-6149), pp. 713-716
- https://doi.org/10.1109/icassp.1999.759766
Abstract
Good HMM-based speech recognition performance requires that HMM conditional independence assumptions introduce at most minimal inaccuracies. In this work, HMM conditional independence assumptions are relaxed in a principled way. For each hidden state value, additional dependencies are added between observation elements to increase both accuracy and discriminability. These additional dependencies are chosen according to natural statistical dependencies that are present in the training data but not well modeled by an HMM. The result is called a buried Markov model (BMM) because the underlying Markov chain in an HMM is further hidden (buried) by specific cross-observation dependencies. Gaussian mixture HMMs are extended to represent BMM dependencies, and new EM update equations are derived. In preliminary experiments with a large-vocabulary isolated-word speech database, BMMs achieve an 11% improvement in word error rate (WER) with only a 9.5% increase in the number of parameters, using a speech recognition system with a single state per monophone.
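The idea of state-specific cross-observation dependencies can be illustrated with a small sketch. It assumes a single diagonal-covariance Gaussian per state (not a mixture) whose mean is shifted by a sparse linear function of selected earlier observation elements. This is one plausible parameterization chosen for illustration, not necessarily the exact form used in the paper, and the names `B`, `z_t`, `hmm_log_density`, and `bmm_log_density` are hypothetical.

```python
import numpy as np

def hmm_log_density(x_t, mean, var):
    """Diagonal-Gaussian log density log p(x_t | q): the plain HMM case."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x_t - mean) ** 2 / var)

def bmm_log_density(x_t, z_t, mean, var, B):
    """Log density log p(x_t | z_t, q) with a context-shifted mean.

    z_t holds the earlier observation elements selected for state q.
    B (assumed name) is that state's dependency matrix; a zero entry
    means no dependency was added between that pair of elements.
    """
    return hmm_log_density(x_t, mean + B @ z_t, var)

# Toy values for one state q (illustrative only, not from the paper).
x_t = np.array([0.5, -1.0])        # current observation frame
z_t = np.array([0.4, 0.0, -0.8])   # selected elements of earlier frames
mean = np.array([0.0, 0.0])
var = np.array([1.0, 2.0])
B = np.array([[0.9, 0.0, 0.0],     # sparse: one added dependency per row
              [0.0, 0.0, 0.5]])

print("HMM log density:", hmm_log_density(x_t, mean, var))
print("BMM log density:", bmm_log_density(x_t, z_t, mean, var, B))
```

Setting `B` to all zeros recovers the ordinary HMM observation density, which is why the extra cost of such a model is limited to the nonzero entries of `B`.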