Putting it all together: language model combination
- 7 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3, pp. 1647-1650
- https://doi.org/10.1109/icassp.2000.862064
Abstract
In the past several years, a number of different language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, modified Kneser-Ney smoothing and clustering. While all of these techniques have been studied separately, they have rarely been studied in combination. We find some significant interactions, especially with smoothing techniques. The combination of all techniques leads to up to a 45% perplexity reduction over a Katz (1987) smoothed trigram model with no count cutoffs, the highest such perplexity reduction reported.
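The abstract names several concrete ingredients: caching, interpolation-style combination of component models, and perplexity as the evaluation metric. As a rough illustrative sketch only, and not the paper's actual combination scheme, the snippet below shows how a unigram cache estimate might be linearly interpolated with a static trigram probability and how perplexity is computed; the interpolation weight `lam`, the cache size, and the toy trigram probabilities are all hypothetical placeholders.

```python
import math
from collections import Counter, deque

def interpolate_with_cache(p_trigram, word, cache_counts, cache_size, lam=0.1):
    """Linearly interpolate a static trigram probability with a unigram
    cache estimate over recently seen words. The weight lam is a
    hypothetical value; in practice it would be tuned on held-out data."""
    p_cache = cache_counts[word] / cache_size if cache_size > 0 else 0.0
    return (1.0 - lam) * p_trigram + lam * p_cache

def perplexity(log_probs):
    """Perplexity is exp(-mean log-probability) over the test words."""
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy usage: score a few words, updating a fixed-size cache as we go.
cache = deque(maxlen=200)           # last 200 words feed the cache estimate
counts = Counter()
log_probs = []
for word, p_tri in [("the", 0.05), ("model", 0.002), ("model", 0.002)]:
    p = interpolate_with_cache(p_tri, word, counts, len(cache))
    log_probs.append(math.log(p))
    if len(cache) == cache.maxlen:  # the deque will evict its oldest word
        counts[cache[0]] -= 1
    cache.append(word)
    counts[word] += 1

print(perplexity(log_probs))
```

On this scale, the 45% reduction quoted in the abstract would correspond to, for example, a baseline perplexity of 100 falling to 55.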
References
- An empirical study of smoothing techniques for language modeling. Computer Speech & Language, 1999.
- Modeling long distance dependence in language: topic mixtures versus dynamic cache models. IEEE Transactions on Speech and Audio Processing, 1999.
- Multi-class composite N-gram based on connection direction. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999.
- On structuring probabilistic dependences in stochastic language modelling. Computer Speech & Language, 1994.
- A Hybrid Approach to Adaptive Statistical Language Modeling. Published by Defense Technical Information Center (DTIC), 1994.
- The SPHINX-II speech recognition system: an overview. Computer Speech & Language, 1993.
- A cache-based natural language model for speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 1990.
- Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1987.