An LVQ based reference model for speaker-adaptive speech recognition

1 January 1992

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 1 (ISSN 1520-6149), pp. 441-444
https://doi.org/10.1109/icassp.1992.225877

Abstract

A novel type of hierarchical phoneme model for speaker adaptation, based on both hidden Markov models (HMM) and learning vector quantization (LVQ) networks, is presented. Low-level tied LVQ phoneme models are trained both speaker-dependently and speaker-independently, yielding a pool of speaker-biased phoneme models that can be mixed into high-level speaker-adaptive phoneme models. Rapid speaker adaptation is performed by finding an optimal mixture of these models at recognition time, given only a small amount of speech data; the models are then fine-tuned to the new speaker's voice by further parameter reestimation. In preliminary experiments with a continuous speech task using 40 context-free phoneme models at task perplexity 111, the authors achieved 82% word accuracy for speaker-dependent recognition and 73% in the speaker-adaptive mode.
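
The abstract does not say how the optimal mixture is found. As a rough illustration only, the sketch below estimates mixture weights over a fixed pool of models with a generic EM update. It is not the authors' procedure: the function name `estimate_mixture_weights`, the input layout, and the use of per-frame likelihoods (rather than LVQ distances) are all assumptions made for this example.

```python
def estimate_mixture_weights(frame_likelihoods, n_iter=50):
    """Generic EM estimate of mixture weights over a fixed pool of models.

    frame_likelihoods[t][k] is a hypothetical, strictly positive likelihood
    that speaker-biased model k assigns to adaptation frame t. The models
    themselves stay fixed; only the mixing weights are re-estimated.
    """
    n_frames = len(frame_likelihoods)
    n_models = len(frame_likelihoods[0])
    weights = [1.0 / n_models] * n_models  # start from a uniform mixture

    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in frame_likelihoods:
            # E-step: responsibility of each model for this frame
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            for k, j in enumerate(joint):
                totals[k] += j / norm
        # M-step: new weights are the average responsibilities
        weights = [t / n_frames for t in totals]
    return weights


# Toy adaptation data: model 0 fits every frame better than model 1.
adaptation_scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
weights = estimate_mixture_weights(adaptation_scores)
```

With only a few adaptation frames, the weights shift toward whichever speaker-biased models already explain the new speaker's data, which is the intuition behind adapting from a small amount of speech.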
