Evaluating the performance of detection algorithms in digital mammography
- 12 February 1999
- journal article
- Published by Wiley in Medical Physics
- Vol. 26 (2), 267-275
- https://doi.org/10.1118/1.598514
Abstract
The initial and relative evaluation of computer methodologies developed for assisting diagnosis in mammography is usually done by comparing the computer output to ground truth data provided by experts and/or biopsy. Reported studies, however, give little information on how the performance indices of computer-assisted diagnosis (CAD) algorithms are determined in this initial stage of evaluation. Several strategies exist for estimating the true positive (TP) and false positive (FP) rates with respect to ground truth. Adopting one strategy over another yields different performance rates that can be over- or underestimates of the true performance. Furthermore, the estimation of pairs of TP and FP rates gives only a partial picture of the performance of an algorithm. It is shown in this work that new performance indices are needed to fully describe the degree of detection (part or whole) and the type of detection (single calcification, cluster of calcifications, mass, or artifact). Several evaluation strategies were tested. The one that yielded the most realistic performances included the following criteria: The detected area should be at least 50% of the true area and no more than four times the true area in order to be considered TP. At least three true calcifications should be detected with nearest neighbor distances of less than √2 cm for a cluster to be considered TP. Separate detection measures should be established and used for artifacts and naturally occurring structures to maximize the benefits of the evaluation. Finally, it is critical that CAD investigators provide information on the tested image set as well as the criteria used for the evaluation of the algorithms to allow comparisons and better understanding of their methodologies.
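As an illustration only, the area and cluster criteria stated in the abstract could be sketched as below. The function names, the use of a plain ratio of areas (the abstract does not say whether overlap with the true region is also required), the representation of calcifications as (x, y) coordinates in cm, and the reading of the cluster rule as "at least three detected true calcifications, each with a nearest neighbor closer than the threshold" are all assumptions, not details given in the abstract. The distance threshold is a parameter so that the value printed in the abstract can be supplied by the caller.

```python
import math


def area_criterion(detected_area, true_area, min_ratio=0.5, max_ratio=4.0):
    """Assumed reading of the area rule: the detected area must be at least
    50% of the true area and no more than four times the true area."""
    if true_area <= 0:
        raise ValueError("true_area must be positive")
    ratio = detected_area / true_area
    return min_ratio <= ratio <= max_ratio


def nearest_neighbor_distances(points):
    """Return, for each (x, y) point, the distance to its closest other point."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        others = [
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        ]
        # A lone point has no neighbor; treat its distance as infinite.
        distances.append(min(others) if others else math.inf)
    return distances


def cluster_criterion(detected_true_points, max_nn_distance_cm, min_count=3):
    """Assumed reading of the cluster rule: at least `min_count` detected true
    calcifications each have a nearest neighbor closer than the threshold."""
    close = [
        d for d in nearest_neighbor_distances(detected_true_points)
        if d < max_nn_distance_cm
    ]
    return len(close) >= min_count


if __name__ == "__main__":
    # Hypothetical values, in arbitrary area units and cm.
    print(area_criterion(detected_area=60.0, true_area=100.0))
    print(cluster_criterion([(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)],
                            max_nn_distance_cm=math.sqrt(2)))
```

Keeping the two rules as separate predicates lets an evaluation report which criterion a detection failed, which is in line with the abstract's call for stating the evaluation criteria explicitly.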