Abstract
There is increasing interest in systems that attempt to automate a task or a transaction using speech input and output. To function effectively with imperfect speech recognition, such systems require an estimate of which words in the recogniser's output are likely to be correct and which can probably be disregarded as incorrect, i.e. a confidence measure for each decoded word. We define a measure for evaluating the effectiveness of a post-classifier which estimates confidence measures, and describe the development of a post-classifier for words decoded from the SWITCHBOARD database, which uses statistics derived from a Viterbi decoder. Without any grouping of the decoded word-classes, the post-classifier increased the probability of correctly deciding whether a decoded word was right or wrong by 32%. When grouping was used, longer words showed an improvement of 65%.
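As a minimal sketch of the idea described above, a post-classifier might combine per-word statistics from the decoder into a single probability that the word was recognised correctly, for instance via a logistic function. The feature names, weights, and threshold below are illustrative assumptions, not values from the paper:

```python
import math

def word_confidence(features, weights, bias):
    """Map per-word decoder statistics (e.g. normalised acoustic score,
    language-model score, word duration -- all hypothetical features)
    to an estimated probability that the decoded word is correct."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative weights and threshold, chosen for the example only.
weights = [1.5, -0.8, 0.3]
bias = -0.2

p = word_confidence([0.9, 0.4, 2.0], weights, bias)
accept = p >= 0.5  # above threshold: treat the decoded word as correct
```

In practice such weights would be trained on held-out decoder output labelled correct/incorrect, and the threshold tuned to the evaluation measure the paper defines.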
