Word predictability after hesitations: a corpus-based study
- 24 December 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3, 1868-1871
- https://doi.org/10.1109/icslp.1996.607996
Abstract
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. Results show that transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely than transitions in fluent sentences to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous-speech applications.
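As an illustration of the two predictability measures named in the abstract, the following minimal sketch computes a bigram transition probability P(word | history) and the entropy of the next-word distribution for a given history. It is not the paper's model or data: the toy corpus, the function names (`train_bigram`, `transition_probability`, `entropy`), the one-word history, and the unsmoothed maximum-likelihood estimates are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count, for each history word, how often each word follows it."""
    successors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for history, word in zip(tokens, tokens[1:]):
            successors[history][word] += 1
    return successors

def transition_probability(successors, history, word):
    """Unsmoothed maximum-likelihood P(word | history); 0.0 if the history is unseen."""
    counts = successors.get(history, Counter())
    total = sum(counts.values())
    return counts[word] / total if total else 0.0

def entropy(successors, history):
    """Entropy in bits of the next-word distribution after `history`."""
    counts = successors.get(history, Counter())
    total = sum(counts.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy corpus; "uh" stands in for a lexical hesitation.
corpus = [
    "i want the book",
    "i want the uh answer",
    "i see the book",
]
model = train_bigram(corpus)

p_fluent = transition_probability(model, "the", "book")     # fluent transition
p_after_uh = transition_probability(model, "uh", "answer")  # word following the hesitation
h_the = entropy(model, "the")                                # uncertainty after "the"
```

A low transition probability marks a single hard-to-predict transition, while a high entropy marks a history after which many continuations are plausible. A practical model would add smoothing and a longer history, which this sketch omits.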
This publication has 2 references indexed in Scilit:
- Statistical language modeling for speech disfluencies. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- SWITCHBOARD: telephone speech corpus for research and development. Published by Institute of Electrical and Electronics Engineers (IEEE), 1992