On the evaluation of document analysis components by recall, precision, and accuracy
- 1 January 1999
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In document analysis, it is common to prove the usefulness of a component by an experimental evaluation. By applying the respective algorithms to a test sample, effectiveness measures such as recall, precision, and accuracy are computed. The goal of such an evaluation is two-fold: on the one hand, it shows that the absolute effectiveness of the algorithm is acceptable for practical use. On the other hand, the evaluation can prove that the algorithm is more or less effective than another algorithm. We argue that experimental evaluation on relatively small test sets, as is very common in document analysis, has to be treated with extreme care from a statistical point of view. In fact, it is surprising how weak the statements derived from such evaluations are.
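To illustrate why a measure computed on a small test set supports only weak statements, the following minimal sketch computes an exact (Clopper-Pearson) binomial confidence interval for a proportion such as recall or precision. It is an illustration only, not necessarily the procedure used in the paper; the function names `binom_cdf` and `clopper_pearson`, the 95% level, and the example counts are assumptions chosen for this sketch.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p). Intended for small n (up to ~1000)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing, iters=100):
    """Find p in [0, 1] with f(p) == target for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for k successes in n trials.

    Hypothetical helper: k could be the number of correctly retrieved items
    and n the number of relevant items (recall) or retrieved items (precision).
    """
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True
    )
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False
    )
    return lower, upper


if __name__ == "__main__":
    # Same observed rate (90%) on a small and a larger test set (made-up counts).
    for k, n in [(45, 50), (450, 500)]:
        lo, hi = clopper_pearson(k, n)
        print(f"{k}/{n}: observed {k / n:.3f}, 95% interval [{lo:.3f}, {hi:.3f}]")
```

With the same observed rate, the interval for the smaller sample is markedly wider, so two algorithms whose intervals overlap on such a sample cannot be reliably ranked from that evaluation alone.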