Word spotting in scanned images using hidden Markov models
- 1 January 1993
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 5, pp. 1-4
- https://doi.org/10.1109/icassp.1993.319732
Abstract
A hidden-Markov-model (HMM)-based system for font-independent spotting of user-specified keywords in a scanned image is described. Word bounding boxes of potential keywords are extracted from the image using a morphology-based preprocessor. Feature vectors based on the external shape and internal structure of the word are computed over vertical columns of pixels in a word bounding box. For each user-specified keyword, an HMM is created by concatenating appropriate context-dependent character HMMs. Nonkeywords are modeled using an HMM based on context-dependent subcharacter models. Keyword spotting is performed using a Viterbi search through the HMM network created by connecting the keyword and nonkeyword HMMs in parallel. Applications of word-image spotting include information filtering in images from facsimile and copy machines, and information retrieval from text image databases.
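The parallel keyword/nonkeyword decision described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each pixel column has already been quantized to a discrete symbol (the paper uses feature vectors), and it uses tiny hand-set models. The names `viterbi_log_score`, `spot`, `keyword`, and `filler` are hypothetical. For a single word image and equal entry probabilities, the best Viterbi path through two models connected in parallel is the better of the two per-model best paths, which is what the sketch computes.

```python
import math

NEG_INF = float("-inf")

def log(p):
    # Log-probability, with log(0) mapped to -inf instead of raising.
    return math.log(p) if p > 0 else NEG_INF

def viterbi_log_score(obs, start, trans, emit):
    """Return the log-probability of the best state path for obs."""
    n = len(start)
    # delta[j] = best log-score of any path that ends in state j
    delta = [log(start[j]) + log(emit[j][obs[0]]) for j in range(n)]
    for o in obs[1:]:
        delta = [
            max(delta[i] + log(trans[i][j]) for i in range(n)) + log(emit[j][o])
            for j in range(n)
        ]
    return max(delta)

def spot(obs, models):
    """Score obs against every model in parallel and return the winner."""
    scores = {name: viterbi_log_score(obs, *m) for name, m in models.items()}
    return max(scores, key=scores.get), scores

# Toy assumption: symbol 0 = "inked" column, symbol 1 = "blank" column.
# Keyword model: two-state left-to-right HMM (inked part, then blank part).
keyword = ([1.0, 0.0],
           [[0.5, 0.5], [0.0, 1.0]],
           [[0.9, 0.1], [0.1, 0.9]])
# Nonkeyword (filler) model: one state with uniform emissions.
filler = ([1.0], [[1.0]], [[0.5, 0.5]])

models = {"keyword": keyword, "filler": filler}

if __name__ == "__main__":
    for columns in ([0, 0, 1, 1], [1, 0, 1, 0]):
        label, scores = spot(columns, models)
        print(columns, label, scores)
```

In this toy setup, a column sequence that matches the keyword's left-to-right shape scores higher under the keyword model, while an irregular sequence is better explained by the filler model and is rejected.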