Comparison of discriminative training criteria
- 1 January 1998
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1 (ISSN 1520-6149), pp. 493-496
- https://doi.org/10.1109/icassp.1998.674475
Abstract
A formally unifying approach for a class of discriminative training criteria, including the maximum mutual information (MMI) and minimum classification error (MCE) criteria, is presented, together with the optimization methods of gradient descent (GD) and the extended Baum-Welch (EB) algorithm. Comparisons are discussed for the MMI and the MCE criterion, including the determination of the sets of word sequence hypotheses for discrimination using word graphs. Experiments have been carried out on the SieTill corpus of telephone-line recorded German continuous digit strings. Using several approaches for acoustic modeling, the word error rates obtained by MMI training using single densities were always better than those for maximum likelihood (ML) using mixture densities. Finally, the results obtained for corrective training (CT), i.e. using only the best recognized word sequence in addition to the spoken word sequence, could not be improved by the word-graph-based discriminative training.
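As background, the two criteria compared in the abstract can be sketched in their standard textbook forms (these are the conventional definitions, not formulas taken from this paper; \(\lambda\) denotes the acoustic model parameters, \(X_r\) the \(r\)-th training utterance, and \(W_r\) its spoken word sequence):

```latex
% MMI: maximize the posterior of the spoken word sequence
% against all competing sequences W (kappa is an acoustic scale).
F_{\mathrm{MMI}}(\lambda) \;=\;
  \sum_{r=1}^{R} \log
  \frac{p_\lambda(X_r \mid W_r)^{\kappa}\, P(W_r)}
       {\sum_{W} p_\lambda(X_r \mid W)^{\kappa}\, P(W)}

% MCE: minimize a smoothed misclassification measure d_r, where the
% competing hypotheses are pooled with an exponent eta and the loss
% is passed through a sigmoid with slope gamma.
d_r \;=\; -\log p_\lambda(X_r \mid W_r)
  \;+\; \frac{1}{\eta}\log\Biggl[\frac{1}{|\mathcal{W}_r|}
  \sum_{W \neq W_r} p_\lambda(X_r \mid W)^{\eta}\Biggr]

F_{\mathrm{MCE}}(\lambda) \;=\;
  \sum_{r=1}^{R} \frac{1}{1 + e^{-2\gamma\, d_r}}
```

In both cases the competing set (the \(W\) in the denominator of MMI, the \(W \neq W_r\) sum in MCE) is what the paper determines from word graphs, as opposed to corrective training, which restricts it to the single best recognized word sequence.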