Postprocessing statistical language models for handwritten Chinese character recognizer

1 April 1999

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)

Vol. 29 (2), 286-291
https://doi.org/10.1109/3477.752802

Abstract

Two statistical language models are investigated for their effectiveness in improving the accuracy of a handwritten Chinese character recognizer. The baseline model is lexical-analytic in nature: it segments a sequence of character images by maximum matching of words, taking word binding forces into account. A model based on bigram statistics of word classes is then investigated and compared against the baseline in terms of the improvement in recognition rate over the image recognizer. On average, the baseline language model improves the recognition rate by about 7%, while the bigram statistics model improves it by about 10%.
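
The two ideas named in the abstract, maximum-matching segmentation and bigram statistics over word classes, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no algorithmic detail, so the function names, the toy lexicon, the class labels, the probabilities, and the `floor` smoothing value are all hypothetical. The sketch also omits word binding forces and, purely for illustration, chains the two steps together, whereas the paper evaluates them as separate models.

```python
import math

def max_match_segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            # Fall back to a single character when no lexicon word matches.
            if length == 1 or piece in lexicon:
                words.append(piece)
                i += length
                break
    return words

def class_bigram_logprob(words, word_class, bigram, floor=1e-6):
    """Sum of log P(class_i | class_{i-1}); unseen class pairs get a small floor value."""
    classes = ["<s>"] + [word_class.get(w, "UNK") for w in words]
    return sum(math.log(bigram.get(pair, floor)) for pair in zip(classes, classes[1:]))

def pick_best(candidates, lexicon, word_class, bigram):
    """Return the candidate string whose segmentation scores highest under the class bigram."""
    def score(text):
        return class_bigram_logprob(max_match_segment(text, lexicon), word_class, bigram)
    return max(candidates, key=score)

# Toy data (assumed): the recognizer confuses the visually similar 我 and 找.
lexicon = {"我们"}
word_class = {"我们": "PRON"}
bigram = {("<s>", "PRON"): 0.2}
best = pick_best(["找们", "我们"], lexicon, word_class, bigram)
```

Here `best` is the candidate that segments into a known word. Because the score is an unnormalized sum of log probabilities, candidates that split into more words are penalized, a bias that a real system would need to handle deliberately.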
