An experimental page layout recognition system for office document automatic classification: an integrated approach for inductive generalization
- 1 January 1990
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. i, 557-562
- https://doi.org/10.1109/icpr.1990.118164
Abstract
A novel approach to automatic classification of digitized office documents, based on the inductive generalization of their layout style, is presented. It is supported by the observation that for a number of printed documents it is possible to find a set of relevant and invariant layout features. These are geometrical characteristics automatically detected through a segmentation and layout analysis process. The learning step, in which significant examples of document classes are used to train the classification system, involves the novel idea of integrating parametric (numerical) and conceptual (symbolic) learning methods.
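As a rough illustration of what integrating parametric and conceptual learning over layout features could look like, the following Python sketch learns one numeric centroid and one symbolic rule per document class. Everything in it is an assumption made for illustration: the `Block` type, the two numeric features, the two predicates, and the nearest-centroid classifier are hypothetical and are not taken from the paper, whose actual features and learning methods the abstract does not specify.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Block:
    """Hypothetical layout block: bounding box in page-relative units (0..1)."""
    x: float
    y: float
    w: float
    h: float


def numeric_features(blocks):
    """Parametric view (assumed features): block count and summed block area."""
    area = sum(b.w * b.h for b in blocks)
    return (float(len(blocks)), area)


def symbolic_features(blocks):
    """Conceptual view (assumed predicates): names of the predicates that hold."""
    predicates = {
        "block_in_top_right": any(b.x > 0.5 and b.y < 0.2 for b in blocks),
        "wide_top_block": any(b.w > 0.8 and b.y < 0.2 for b in blocks),
    }
    return frozenset(name for name, holds in predicates.items() if holds)


def train(examples):
    """examples maps a class name to a non-empty list of pages (lists of Block).

    Per class, store the mean numeric vector and the set of predicates
    shared by every training page (a very simple inductive generalization).
    """
    model = {}
    for label, pages in examples.items():
        vecs = [numeric_features(p) for p in pages]
        centroid = tuple(sum(col) / len(vecs) for col in zip(*vecs))
        rule = frozenset.intersection(*(symbolic_features(p) for p in pages))
        model[label] = (centroid, rule)
    return model


def classify(model, blocks):
    """Keep classes whose rule is satisfied, then pick the nearest centroid."""
    feats = symbolic_features(blocks)
    vec = numeric_features(blocks)
    candidates = {
        label: centroid
        for label, (centroid, rule) in model.items()
        if rule <= feats
    }
    if not candidates:
        return None  # no learned rule matches this layout
    return min(candidates, key=lambda label: dist(vec, candidates[label]))
```

In this sketch the symbolic rule acts as a filter and the numeric distance only breaks ties among the classes that pass it. That is one of several plausible ways to combine the two kinds of learner, chosen here for brevity, and the numeric features are left unscaled.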
This publication has 5 references indexed in Scilit:
- Letter pattern recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- High Level Document Analysis Guided by Geometric Aspects. International Journal of Pattern Recognition and Artificial Intelligence, 1988
- Knowledge based document classification supporting integrated document handling. Published by Association for Computing Machinery (ACM), 1988
- Document Analysis System. IBM Journal of Research and Development, 1982
- Pattern Recognition as Rule-Guided Inductive Inference. Published by Institute of Electrical and Electronics Engineers (IEEE), 1980