Abstract
The authors apply a hidden Markov model (HMM) and a level-building dynamic programming algorithm to the robust machine recognition of words formed by connected and degraded characters in poorly printed text. A structural analysis algorithm segments a word into sub-character segments, independently of the character boundaries, and identifies the primitive features in each segment, such as strokes and arcs. In the HMM for each character, the states are statistically represented by the sub-character segments, and the state characteristics are obtained by estimating the state probability functions from training samples. The level-building dynamic programming algorithm combines word segmentation and recognition in one operation and chooses the most probable grouping of characters when recognizing an unknown word. Computer experiments demonstrate the robustness and effectiveness of the system in recognizing words formed by degraded and connected characters.
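The idea of combining segmentation and recognition in one dynamic programming pass can be illustrated with a minimal sketch. It is not the authors' implementation: the segment labels ("v" for a vertical stroke, "a" for an arc), the three toy character models, the probabilities, the emission floor, and the names `viterbi_score`, `level_build` and `MODELS` are all assumptions made for illustration. Each character is modelled as a left-to-right HMM over sub-character segments, and level `l` of the table holds the best score for exactly `l` characters.

```python
import math

NEG_INF = float("-inf")
FLOOR = 1e-6  # assumed emission floor for unseen segment labels

# Hypothetical toy models: one emission table and one self-loop
# probability per state. Values are invented, not from the paper.
MODELS = {
    "i": {"emit": [{"v": 0.9, "a": 0.1}], "stay": [0.1]},
    "o": {"emit": [{"a": 0.9, "v": 0.1}, {"a": 0.9, "v": 0.1}],
          "stay": [0.1, 0.1]},
    "n": {"emit": [{"v": 0.9, "a": 0.1}, {"a": 0.9, "v": 0.1},
                   {"v": 0.9, "a": 0.1}],
          "stay": [0.1, 0.1, 0.1]},
}

def viterbi_score(model, obs):
    """Best log score of a left-to-right path that starts in the first
    state, ends in the last state, and emits every symbol in obs."""
    emit, stay = model["emit"], model["stay"]
    n = len(emit)
    delta = [NEG_INF] * n
    delta[0] = math.log(emit[0].get(obs[0], FLOOR))
    for o in obs[1:]:
        new = [NEG_INF] * n
        for j in range(n):
            best = delta[j] + math.log(stay[j])          # stay in state j
            if j > 0:                                    # or advance from j-1
                best = max(best, delta[j - 1] + math.log(1 - stay[j - 1]))
            new[j] = best + math.log(emit[j].get(o, FLOOR))
        delta = new
    return delta[-1]

def level_build(models, obs, max_chars):
    """Return (word, log_score) for the best grouping of obs into at
    most max_chars characters; ("", -inf) if no grouping is possible."""
    T = len(obs)
    # best[l][t]: best score for exactly l characters covering obs[:t]
    best = [[NEG_INF] * (T + 1) for _ in range(max_chars + 1)]
    back = [[None] * (T + 1) for _ in range(max_chars + 1)]
    best[0][0] = 0.0
    for l in range(1, max_chars + 1):
        for t in range(1, T + 1):
            for s in range(l - 1, t):        # candidate character boundary
                if best[l - 1][s] == NEG_INF:
                    continue
                for ch, model in models.items():
                    score = best[l - 1][s] + viterbi_score(model, obs[s:t])
                    if score > best[l][t]:
                        best[l][t], back[l][t] = score, (s, ch)
    level = max(range(1, max_chars + 1), key=lambda k: best[k][T])
    if best[level][T] == NEG_INF:
        return "", NEG_INF
    chars, t, score = [], T, best[level][T]
    while level > 0:                         # trace boundaries backwards
        s, ch = back[level][t]
        chars.append(ch)
        t, level = s, level - 1
    return "".join(reversed(chars)), score

word, _ = level_build(MODELS, ["v", "a", "v", "a", "a"], max_chars=4)
print(word)
```

In this sketch the character boundaries are never fixed in advance: every split point `s` is tried at every level, so the segmentation falls out of the same search that picks the characters. A practical system would estimate the state probability functions from training samples and would likely restrict the search with a lexicon or duration constraints, neither of which is modelled here.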
