Character string extraction by multi-stage relaxation
- 22 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1, pp. 298-302
- https://doi.org/10.1109/icdar.1997.619860
Abstract
An extraction algorithm for character strings is proposed. We first obtain a set of eight-connected components from a document image and then apply a relaxation method to these components. The method strengthens or weakens the mutual connections between components depending on the state of the neighboring components. As the relaxation method is applied repeatedly, the process proceeds from local connections to global connections, and finally character strings are extracted. We call this process multi-stage relaxation. The advantages of this algorithm are that it does not need to nominate character components from an image beforehand, it adapts to character size and font, and it can cope with a document that includes strings with various orientations. In our experiments we use a color image of a magazine cover and a monochromatic image of a graph. For the color image, the multi-stage relaxation was executed on each binary image obtained by color segmentation. Lastly, we show the results of the experiments and discuss the effectiveness of our method.
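The first two steps named in the abstract can be illustrated with a minimal Python sketch. The component labeling follows the standard definition of eight-connectivity. The relaxation pass is only a toy: the abstract does not give the paper's update rule, so the averaging rule, the `rate` parameter, and the function names `eight_connected_components` and `relax` are all assumptions made for illustration, not the authors' method.

```python
from collections import deque


def eight_connected_components(image):
    """Group foreground (truthy) pixels of a 2D list into 8-connected components.

    Returns a list of components, each a list of (row, col) pixels.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8 neighbors of each pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def relax(strengths, rate=0.5):
    """One hypothetical relaxation pass over connection strengths in [0, 1].

    `strengths` maps a pair of component indices (i, j) to a strength.
    Assumed rule (not from the paper): each connection moves toward the
    mean strength of the other connections that share one of its endpoints,
    so it increases or decreases depending on its neighbors.
    """
    updated = {}
    for (i, j), s in strengths.items():
        support = [t for (a, b), t in strengths.items()
                   if (a, b) != (i, j) and {a, b} & {i, j}]
        target = sum(support) / len(support) if support else s
        updated[(i, j)] = min(1.0, max(0.0, s + rate * (target - s)))
    return updated


if __name__ == "__main__":
    image = [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
    ]
    # The two diagonal pixels merge under 8-connectivity.
    print(eight_connected_components(image))
    print(relax({(0, 1): 1.0, (1, 2): 0.0}))
```

A multi-stage variant would call a pass like `relax` repeatedly while widening the neighborhood that defines which components may be connected. How the paper widens that neighborhood is not stated in the abstract.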
This publication has 5 references indexed in Scilit:
- The document spectrum for page layout analysis. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- Automated entry system for printed documents. Pattern Recognition, 1990
- A robust algorithm for text string separation from mixed text/graphics images. Published by Institute of Electrical and Electronics Engineers (IEEE), 1988
- Document Analysis System. IBM Journal of Research and Development, 1982
- Color image quantization for frame buffer display. ACM SIGGRAPH Computer Graphics, 1982