Reading newspaper text

Abstract
The authors describe a method for segmenting a newspaper page image into labeled macro components (blocks) and recognizing the content. Connected component analysis is used to segment a newspaper image into several rectangular blocks and to filter connected components into character and noncharacter components. Textural analysis is then used to classify the remaining noncharacter components into graphics and photographs. Experimental results indicate that these techniques work very well.
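
The connected component step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how components are filtered, so the bounding-box size rule, the thresholds `max_height` and `max_width`, the 4-connectivity, and the function names `connected_components` and `split_by_size` are all assumptions made for illustration. The sketch labels foreground regions of a small binary image, records each region's bounding rectangle, and separates small (character-like) regions from large ones.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 4-connected
    foreground regions (pixels equal to 1) in a binary image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_by_size(boxes, max_height=3, max_width=3):
    """Assumed filter: a box no larger than the thresholds counts as
    character-like; anything larger counts as noncharacter."""
    chars, others = [], []
    for box in boxes:
        top, left, bottom, right = box
        height, width = bottom - top + 1, right - left + 1
        if height <= max_height and width <= max_width:
            chars.append(box)
        else:
            others.append(box)
    return chars, others


# Toy "page": two small blobs on the left and one large block on the right.
PAGE = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 1, 1],
]

chars, others = split_by_size(connected_components(PAGE))
print(chars)   # character-like bounding boxes
print(others)  # noncharacter bounding boxes
```

In this toy example the large block is the only region that would be passed on to a later stage such as the textural analysis the abstract mentions; that stage is not sketched here because the abstract gives no detail about the texture features used.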
