Putting it all together: language model combination
- 7 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3, pp. 1647-1650
- https://doi.org/10.1109/icassp.2000.862064
Abstract
In the past several years, a number of different language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, modified Kneser-Ney smoothing and clustering. While all of these techniques have been studied separately, they have rarely been studied in combination. We find some significant interactions, especially with smoothing techniques. The combination of all techniques leads to up to a 45% perplexity reduction over a Katz (1987) smoothed trigram model with no count cutoffs, the highest such perplexity reduction reported.
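The abstract names several concrete ingredients: caching, interpolation-style combination of component models, and perplexity as the evaluation metric. As a rough illustrative sketch only, and not the paper's actual combination scheme, the snippet below shows how a unigram cache estimate might be linearly interpolated with a static trigram probability and how perplexity is computed; the interpolation weight `lam`, the cache size, and the toy trigram probabilities are all hypothetical placeholders.

```python
import math
from collections import Counter, deque

def interpolate_with_cache(p_trigram, word, cache_counts, cache_size, lam=0.1):
    """Linearly interpolate a static trigram probability with a unigram
    cache estimate over recently seen words. The weight lam is a
    hypothetical value; in practice it would be tuned on held-out data."""
    p_cache = cache_counts[word] / cache_size if cache_size > 0 else 0.0
    return (1.0 - lam) * p_trigram + lam * p_cache

def perplexity(log_probs):
    """Perplexity is exp(-mean log-probability) over the test words."""
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy usage: score a few words, updating a fixed-size cache as we go.
cache = deque(maxlen=200)           # last 200 words feed the cache estimate
counts = Counter()
log_probs = []
for word, p_tri in [("the", 0.05), ("model", 0.002), ("model", 0.002)]:
    p = interpolate_with_cache(p_tri, word, counts, len(cache))
    log_probs.append(math.log(p))
    if len(cache) == cache.maxlen:  # the deque will evict its oldest word
        counts[cache[0]] -= 1
    cache.append(word)
    counts[word] += 1

print(perplexity(log_probs))
```

On this scale, the 45% reduction quoted in the abstract would correspond to, for example, a baseline perplexity of 100 falling to 55.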
References
- An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 1999.
- Modeling long distance dependence in language: topic mixtures versus dynamic cache models. IEEE Transactions on Speech and Audio Processing, 1999.
- Multi-class composite N-gram based on connection direction. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999.
- On structuring probabilistic dependences in stochastic language modelling. Computer Speech & Language, 1994.
- A Hybrid Approach to Adaptive Statistical Language Modeling. Published by Defense Technical Information Center (DTIC), 1994.
- The SPHINX-II speech recognition system: an overview. Computer Speech & Language, 1993.
- A cache-based natural language model for speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 1990.
- Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1987.