Performance metrics for document understanding systems
- 30 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Requirements for the objective evaluation of automated data-entry systems are presented. Because the cost of correcting errors dominates the document conversion process, the most important characteristic of an OCR device is accuracy. However, different measures of accuracy (error metrics) are appropriate for different applications, and at the character, word, text-line, text-block, and document levels. For wholly objective assessment, OCR devices must be tested under programmed, rather than interactive, control.
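
To illustrate how the same OCR output can score differently at the character and word levels, here is a minimal sketch. It assumes accuracy is defined as one minus the edit distance divided by the ground-truth length, a common convention that the abstract itself does not specify. The function names and sample strings are hypothetical, chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn the sequence `ref` into the sequence `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hyp costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # delete r
                            curr[j - 1] + 1,       # insert h
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - errors / len(ref).
    Can be negative when hyp contains many spurious insertions."""
    if len(ref) == 0:
        raise ValueError("ground truth must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Hypothetical ground truth and OCR output with two character confusions.
truth = "the quick brown fox"
ocr_output = "the qu1ck brown f0x"

char_acc = accuracy(truth, ocr_output)                  # units are characters
word_acc = accuracy(truth.split(), ocr_output.split())  # units are words

print(f"character accuracy: {char_acc:.3f}")
print(f"word accuracy:      {word_acc:.3f}")
```

Under these assumptions, two misread characters cost only a small fraction of character accuracy but invalidate half of the words, which is one reason the appropriate metric depends on the application.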