Character string extraction by multi-stage relaxation

Abstract
We propose an algorithm for extracting character strings from document images. We first obtain a set of eight-connected components from a document image and then apply a relaxation method to them. The method strengthens or weakens the mutual connections between components depending on the state of the neighboring components. As the relaxation is applied repeatedly, the process proceeds from local connections to global connections, and the character strings are finally extracted. We call this process multi-stage relaxation. The algorithm has three advantages: it does not need to nominate character components from the image beforehand, it adapts to character size and font, and it can handle documents that contain strings in various orientations. In our experiments we use a color image of a magazine cover and a monochrome image of a graph. For the color image, multi-stage relaxation was executed on each binary image obtained by color segmentation. Finally, we show the experimental results and discuss the effectiveness of the method.
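
The first step, obtaining eight-connected components from a binary image, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `eight_connected_components` and `bounding_box`, the list-of-rows image representation, and the breadth-first traversal are all assumptions made for illustration. The relaxation stages are not sketched, because the abstract does not specify their update rule.

```python
from collections import deque


def eight_connected_components(image):
    """Group foreground pixels (truthy values) into eight-connected components.

    `image` is a binary image given as a list of equal-length rows
    (an assumed representation). Returns a list of components, each
    a list of (row, col) pixel coordinates.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []

    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Start a breadth-first traversal from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                # Visit all eight neighbors, including the four diagonals.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        if dy == 0 and dx == 0:
                            continue
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (min_row, min_col, max_row, max_col) for one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (min(ys), min(xs), max(ys), max(xs))


# Hypothetical toy image: two diagonal pixels and a short vertical stroke.
toy_image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
toy_components = eight_connected_components(toy_image)
toy_boxes = [bounding_box(p) for p in toy_components]
```

In the toy image, the two diagonal pixels form a single component only because diagonal neighbors count as connected. Under four-connectivity they would be separate components.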
