Toward island-of-reliability-driven very-large-vocabulary on-line handwriting recognition using character confidence scoring
- 13 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 3 (ISSN 1520-6149), 1525-1528
- https://doi.org/10.1109/icassp.2001.941222
Abstract
We explore a novel approach for handwriting recognition tasks whose intrinsic vocabularies are too large to be applied directly as constraints during recognition. Our approach makes use of vocabulary constraints, and addresses the issue that some parts of words may be written more recognizably than others. An initial pass is made with an HMM recognizer, without vocabulary constraints, generating a lattice of character-hypothesis arcs representing likely segmentations of the handwriting signal. Arc confidence scores are computed using a posteriori probabilities. The most confidently recognized characters are used to filter the overall vocabulary, generating a word subset manageable for constraining a second recognition pass. With a vocabulary of 273000 words, we can limit to 50000 words in the second pass and eliminate 39.3% of the word errors made by a one-pass recognizer without vocabulary constraints, and 18.3% of errors made using a fixed 30000-word set.
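The two-stage idea in the abstract (posterior-based arc confidence, then vocabulary filtering by the most confident characters) can be illustrated with a toy sketch. This is not the paper's implementation: the lattice layout, the scores, the 0.8 threshold, the tiny word list, and the rule that confident characters must appear in the word in left-to-right order are all assumptions made for illustration. Arc posteriors are computed here with a standard forward-backward pass over a lattice whose nodes are numbered in topological order.

```python
import math

NEG_INF = float("-inf")

def log_add(a, b):
    """Return log(exp(a) + exp(b)), handling -inf operands."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def arc_posteriors(arcs, n_nodes):
    """Forward-backward over a lattice of (start, end, char, log_score) arcs.

    Assumes start < end for every arc, node 0 is the lattice start, and
    node n_nodes - 1 is the lattice end. Returns one posterior per arc.
    """
    fwd = [NEG_INF] * n_nodes
    bwd = [NEG_INF] * n_nodes
    fwd[0] = 0.0
    bwd[n_nodes - 1] = 0.0
    # Forward pass: arcs ordered by start node, so fwd[start] is final when used.
    for s, e, _, w in sorted(arcs, key=lambda a: a[0]):
        fwd[e] = log_add(fwd[e], fwd[s] + w)
    # Backward pass: arcs ordered by descending end node, so bwd[end] is final.
    for s, e, _, w in sorted(arcs, key=lambda a: a[1], reverse=True):
        bwd[s] = log_add(bwd[s], w + bwd[e])
    total = fwd[n_nodes - 1]
    return [math.exp(fwd[s] + w + bwd[e] - total) for s, e, _, w in arcs]

def contains_in_order(word, chars):
    """True if chars occur in word as a left-to-right subsequence."""
    it = iter(word)
    return all(ch in it for ch in chars)

# Hypothetical lattice for a three-segment word; scores are made up.
arcs = [
    (0, 1, "c", math.log(0.9)),
    (0, 1, "e", math.log(0.1)),
    (0, 2, "d", math.log(0.05)),  # alternative segmentation spanning two segments
    (1, 2, "a", math.log(0.5)),
    (1, 2, "o", math.log(0.5)),
    (2, 3, "t", math.log(0.9)),
    (2, 3, "l", math.log(0.1)),
]
posteriors = arc_posteriors(arcs, n_nodes=4)

# Keep the "islands of reliability": arcs whose posterior clears the threshold.
THRESHOLD = 0.8  # assumed value, not taken from the paper
islands = sorted(
    (s, ch) for (s, _, ch, _), p in zip(arcs, posteriors) if p >= THRESHOLD
)
confident = [ch for _, ch in islands]

# Filter a (toy) vocabulary down to words compatible with the confident characters.
vocabulary = ["cat", "cot", "cut", "eat", "car", "coat", "act"]
subset = [w for w in vocabulary if contains_in_order(w, confident)]
print(confident, subset)
```

In this toy lattice the ambiguous middle character ("a" versus "o") falls below the threshold, so only "c" and "t" constrain the word list; the surviving subset would then serve as the vocabulary for a second, constrained recognition pass.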