Page segmentation using texture discrimination masks

19 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 3, 308-311
https://doi.org/10.1109/icip.1995.538546

Abstract

We propose a new texture-based page segmentation algorithm which automatically extracts the text, halftone, and line-drawing regions from input greyscale document images. This approach utilizes a neural network to train a set of masks which is optimal for discriminating the three main texture classes in the page segmentation problem: halftone, background, and text and line-drawing regions. The text and line-drawing regions are further discriminated based on connectivity analysis. We have applied the algorithm to successfully segment English and Chinese document images. We also demonstrate that the masks can perform language separation (English/Chinese) when appropriately trained.
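
As a rough illustration of the mask-based classification described in the abstract, the following Python sketch applies a small bank of masks to a greyscale image, uses the absolute responses as texture features, and assigns each pixel to one of the three classes with a linear scoring step. The function names, the class ordering, and the linear scoring are assumptions made for illustration only. The abstract does not give the network architecture, the mask sizes, or the mask values, and in the paper the masks are trained by a neural network rather than chosen by hand.

```python
import numpy as np

# Hypothetical class ordering; the abstract names the classes but not their indices.
CLASSES = ("background", "halftone", "text_or_line_drawing")


def apply_mask(image, mask):
    """Valid-mode 2-D correlation of a greyscale image with a single mask."""
    mh, mw = mask.shape
    out_h = image.shape[0] - mh + 1
    out_w = image.shape[1] - mw + 1
    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per mask coefficient.
    for dy in range(mh):
        for dx in range(mw):
            out += mask[dy, dx] * image[dy:dy + out_h, dx:dx + out_w]
    return out


def texture_features(image, masks):
    """Stack absolute mask responses into an (H', W', n_masks) array.

    All masks are assumed to share the same shape so the responses align.
    """
    return np.stack([np.abs(apply_mask(image, m)) for m in masks], axis=-1)


def classify(features, weights, bias):
    """Score each pixel linearly and return the index of the best class.

    weights has shape (n_masks, n_classes) and bias has shape (n_classes,).
    In a trained system these would come from learning, not from hand tuning.
    """
    scores = features @ weights + bias
    return scores.argmax(axis=-1)
```

Calling `classify(texture_features(image, masks), weights, bias)` yields an integer label map whose entries index into `CLASSES`. The sketch stops at the three-way texture decision; separating text from line drawings by connectivity analysis, as the abstract describes, would be a further step on the resulting label map.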
