Receiver operating characteristic analysis for intelligent medical systems: a new approach for finding confidence intervals
- 1 July 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Biomedical Engineering
- Vol. 47 (7), 952-963
- https://doi.org/10.1109/10.846690
Abstract
Intelligent systems are increasingly being deployed in medicine and healthcare, but there is a need for a robust and objective methodology for evaluating such systems. Potentially, receiver operating characteristic (ROC) analysis could form a basis for the objective evaluation of intelligent medical systems. However, it has several weaknesses when applied to the types of data used to evaluate intelligent medical systems. First, small data sets are often used, and existing methods handle these unsatisfactorily. Second, many existing ROC methods rely on parametric assumptions which may not always be valid for the test cases selected. Third, system evaluations are often more concerned with particular, clinically meaningful points on the curve than with global indexes such as the more commonly used area under the curve. A novel, robust and accurate method is proposed, derived from first principles, which calculates the probability density function (pdf) for each point on a ROC curve for any given sample size. Confidence intervals are produced as contours on the pdf. The theoretical work has been validated by Monte Carlo simulations. It has also been applied to two real-world examples of ROC analysis, taken from the literature (classification of mammograms and differential diagnosis of pancreatic diseases), to investigate the confidence surfaces produced for real cases, and to illustrate how analysis of system performance can be enhanced. We illustrate the impact of sample size on system performance from analysis of ROC pdfs and 95% confidence boundaries. This work establishes an important new method for generating pdfs, and provides an accurate and robust method of producing confidence intervals for ROC curves for the small sample sizes typical of intelligent medical systems. It is conjectured that, potentially, the method could be extended to determine risks associated with the deployment of intelligent medical systems in clinical practice.
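The abstract's point about sample size can be illustrated with a small simulation. The sketch below is not the paper's analytic pdf derivation; it is a plain Monte Carlo illustration, under the assumption that each test case is classified independently, of how the spread of an empirical ROC operating point grows as the number of cases shrinks. The operating point (true positive rate 0.8, false positive rate 0.1), the sample sizes, and all function names are hypothetical choices made for this example.

```python
import random


def simulate_operating_point(tpr, fpr, n_pos, n_neg, trials=2000, seed=0):
    """Draw empirical (FPR, TPR) estimates for one assumed ROC operating point.

    Assumption: cases are independent, so the counts are binomial.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        # Each positive case is detected with probability tpr.
        true_pos = sum(rng.random() < tpr for _ in range(n_pos))
        # Each negative case raises a false alarm with probability fpr.
        false_pos = sum(rng.random() < fpr for _ in range(n_neg))
        samples.append((false_pos / n_neg, true_pos / n_pos))
    return samples


def percentile_interval(values, level=0.95):
    """Return a simple percentile interval covering `level` of the values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int((1.0 - tail) * last)]


def tpr_interval_width(n_pos, n_neg):
    """Width of the 95% percentile interval on the estimated TPR."""
    samples = simulate_operating_point(0.8, 0.1, n_pos, n_neg)
    low, high = percentile_interval([tpr_hat for _, tpr_hat in samples])
    return high - low


if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n, round(tpr_interval_width(n, n), 3))
```

Running the script prints the interval width on the estimated true positive rate for each hypothetical sample size; the width shrinks as the number of cases grows. A percentile interval on one axis is a simplification: the paper describes confidence regions as contours on a two-dimensional pdf, which this sketch does not attempt to reproduce.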
This publication has 29 references indexed in Scilit:
- Depth of anesthesia estimation and control [using auditory evoked potentials]. IEEE Transactions on Biomedical Engineering, 1999
- Application of simulated annealing fuzzy model tuning to umbilical cord acid-base interpretation. IEEE Transactions on Fuzzy Systems, 1999
- A novel approach to microcalcification detection using fuzzy logic technique. IEEE Transactions on Medical Imaging, 1998
- Evaluation of the diagnostic performance of the expert EMG assistant MUNIN. Electroencephalography and Clinical Neurophysiology/Electromyography and Motor Control, 1996
- A multicentre comparative study of 17 experts and an intelligent computer system for managing labour using the cardiotocogram. BJOG: An International Journal of Obstetrics and Gynaecology, 1995
- Performance evaluation of medical expert systems using ROC curves. Computers and Biomedical Research, 1989
- Measuring the Accuracy of Diagnostic Systems. Science, 1988
- Efficient and portable combined random number generators. Communications of the ACM, 1988
- Statistical Approaches to the Analysis of Receiver Operating Characteristic (ROC) Curves. Medical Decision Making, 1984
- The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology, 1975