Benchmarking page segmentation algorithms
- 1 January 1994
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10636919, p. 411-416
- https://doi.org/10.1109/cvpr.1994.323859
Abstract
A method for automatically evaluating the quality of document page segmentation algorithms is introduced. Many different zoning techniques are now available but there is no robust method available to benchmark and evaluate them reliably. Our proposed strategy is a region-based approach, in which segmentation results are compared with manually generated "ground truth files", describing all possible correct segmentations. A segmentation ground truthing scheme has been proposed. The evaluation of segmentation quality is achieved by testing the overlap between the two sets of regions. In fact, the regions are defined as the "black" pixels contained in the extracted polygons. An explicit specification of segmentation errors and a numerical evaluation are derived. The algorithm is simple and fast, and provides a multi-level output for each segmentation.
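The overlap test described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes axis-aligned rectangles in place of the extracted polygons, a binary image in which 1 marks a "black" pixel, and a single ground-truth segmentation. The names `black_pixels`, `overlap_matrix`, and `classify`, and the error labels "split", "merged", and "missed", are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: regions are rectangles (top, left, bottom, right) with
# bottom/right exclusive. The paper works with polygons instead.
Rect = Tuple[int, int, int, int]
Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def black_pixels(image: Image, rect: Rect) -> Set[Tuple[int, int]]:
    """Return the coordinates of the black pixels inside a rectangle."""
    top, left, bottom, right = rect
    return {
        (r, c)
        for r in range(top, bottom)
        for c in range(left, right)
        if image[r][c] == 1
    }


def overlap_matrix(image: Image, truth: List[Rect], detected: List[Rect]) -> List[List[int]]:
    """Entry [i][j] counts black pixels shared by truth region i and detected region j."""
    truth_sets = [black_pixels(image, t) for t in truth]
    detected_sets = [black_pixels(image, d) for d in detected]
    return [[len(t & d) for d in detected_sets] for t in truth_sets]


def classify(matrix: List[List[int]]) -> Dict[str, List[int]]:
    """Derive illustrative error labels from the overlap matrix."""
    report: Dict[str, List[int]] = {"split": [], "merged": [], "missed": []}
    n_detected = len(matrix[0]) if matrix else 0
    for i, row in enumerate(matrix):
        hits = [j for j, count in enumerate(row) if count > 0]
        if not hits:
            report["missed"].append(i)   # truth region covered by nothing
        elif len(hits) > 1:
            report["split"].append(i)    # truth region cut into several pieces
    for j in range(n_detected):
        hits = [i for i, row in enumerate(matrix) if row[j] > 0]
        if len(hits) > 1:
            report["merged"].append(j)   # detected region spans several truth regions
    return report


# Toy page: two text blocks on top, one full-width line at the bottom.
IMAGE = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
TRUTH = [(0, 0, 2, 2), (0, 4, 2, 6), (3, 0, 4, 6)]
DETECTED = [(0, 0, 2, 6), (3, 0, 4, 3), (3, 3, 4, 6)]

MATRIX = overlap_matrix(IMAGE, TRUTH, DETECTED)
REPORT = classify(MATRIX)
```

In this toy case the first detected region covers both upper ground-truth blocks, so it is reported as merged, while the bottom ground-truth line is covered by two detected regions and is reported as split. Counting only black pixels means that white margins inside a region do not affect the comparison, which matches the abstract's definition of a region.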
This publication has 4 references indexed in Scilit:
- Performance metrics for document understanding systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Image segmentation by shape-directed covers. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Page segmentation and classification. CVGIP: Graphical Models and Image Processing, 1992
- Document Image Defect Models. Published by Springer Nature, 1992