Fast speaker adaptation combined with soft vector quantization in an HMM speech recognition system
- 1 January 1992
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1 (ISSN 1520-6149), pp. 461-464
- https://doi.org/10.1109/icassp.1992.225872
Abstract
The authors describe a method for combining speaker adaptation by feature vector transformation with semi-continuous hidden Markov modeling (SCHMM). Since the reference speaker's voice is represented in the SCHMM system by multidimensional Gaussian distributions, it is these distributions rather than the feature vectors that must be transformed. The performance of hard-decision vector quantization (HVQ), soft-decision VQ (SVQ), and SCHMM is compared, as is that of the speaker-adaptive and speaker-independent systems. In addition, the influence of dynamic features is investigated. The definition of subword units is optimized, and the SCHMM system is optimized with respect to covariance structure (full or diagonal matrices) and codebook size. Model initialization and distribution reestimation during training are introduced. Significant improvements in mean recognition rate under difficult conditions are obtained compared to previously reported systems based on HVQ: from 71.6% to 84.6% (speaker-independent) and from 80.4% to 87.4% (speaker-adaptive).
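As an illustration of two ideas in the abstract, the sketch below shows (a) how a Gaussian codebook entry changes when the underlying feature vectors are mapped through a transformation, and (b) the difference between a hard and a soft codebook decision. It is not the authors' implementation. The affine form y = A x + b, the equal codeword priors, the diagonal covariances in the soft-decision part, and all function names are assumptions made for the example; the abstract does not specify them.

```python
import math

def transform_gaussian(mean, cov, A, b):
    """Map N(mean, cov) through y = A x + b, giving N(A mean + b, A cov A^T).
    The affine map is an assumption; the abstract does not state the form."""
    n = len(mean)
    rows = len(A)
    new_mean = [sum(A[i][k] * mean[k] for k in range(n)) + b[i] for i in range(rows)]
    # AC = A * cov, then new_cov = AC * A^T
    AC = [[sum(A[i][k] * cov[k][j] for k in range(n)) for j in range(n)]
          for i in range(rows)]
    new_cov = [[sum(AC[i][k] * A[j][k] for k in range(n)) for j in range(rows)]
               for i in range(rows)]
    return new_mean, new_cov

def diag_log_density(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def soft_weights(x, codebook):
    """Soft decision: normalized likelihood of every codeword (equal priors assumed)."""
    logs = [diag_log_density(x, m, v) for m, v in codebook]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def hard_index(x, codebook):
    """Hard decision: keep only the index of the most likely codeword."""
    w = soft_weights(x, codebook)
    return w.index(max(w))

if __name__ == "__main__":
    # Transform one Gaussian instead of transforming individual feature vectors.
    mean, cov = transform_gaussian([1.0, 2.0], [[1.0, 0.5], [0.5, 2.0]],
                                   [[2.0, 0.0], [0.0, 1.0]], [1.0, -1.0])
    print(mean, cov)

    # A vector lying between two codewords: the hard decision discards the
    # near-tie that the soft weights preserve.
    codebook = [([0.0, 0.0], [1.0, 1.0]), ([2.0, 2.0], [1.0, 1.0])]
    x = [0.9, 1.0]
    print(hard_index(x, codebook), soft_weights(x, codebook))
```

In this toy setting the hard decision returns a single index, while the soft decision keeps a weight for each codeword, which is the kind of information a semi-continuous model can use in its output probabilities.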
This publication has 4 references indexed in Scilit:
- Speaker adaptation for recognition systems with a large vocabulary. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Fast speaker adaptation for speech recognition systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Soft-decision vector quantization based on the Dempster/Shafer theory. Published by Institute of Electrical and Electronics Engineers (IEEE), 1991
- Semi-continuous hidden Markov models for speech signals. Computer Speech & Language, 1989