Receiver operating characteristic analysis for intelligent medical systems: a new approach for finding confidence intervals

1 July 2000

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Biomedical Engineering

Vol. 47 (7) , 952-963
https://doi.org/10.1109/10.846690

Abstract

Intelligent systems are increasingly being deployed in medicine and healthcare, but a robust and objective methodology for evaluating such systems is still needed. Receiver operating characteristic (ROC) analysis could form a basis for such evaluation. However, it has several weaknesses when applied to the types of data used to evaluate intelligent medical systems. First, the data sets are often small, and existing methods handle small samples poorly. Second, many existing ROC methods rely on parametric assumptions which may not be valid for the test cases selected. Third, system evaluations are often more concerned with particular, clinically meaningful points on the curve than with global indexes such as the commonly used area under the curve. A novel, robust and accurate method is proposed, derived from first principles, which calculates the probability density function (pdf) for each point on a ROC curve for any given sample size. Confidence intervals are produced as contours on the pdf. The theoretical work has been validated by Monte Carlo simulations. It has also been applied to two real-world examples of ROC analysis taken from the literature (classification of mammograms and differential diagnosis of pancreatic diseases), to investigate the confidence surfaces produced for real cases and to illustrate how analysis of system performance can be enhanced. We illustrate the impact of sample size on system performance from analysis of ROC pdfs and 95% confidence boundaries. This work establishes an important new method for generating pdfs, and provides an accurate and robust method of producing confidence intervals for ROC curves for the small sample sizes typical of intelligent medical systems. It is conjectured that the method could be extended to determine risks associated with the deployment of intelligent medical systems in clinical practice.
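
The abstract does not give the derivation, so the following is not the authors' method. It is a minimal sketch of the general idea, under loudly stated assumptions: a single operating point with known true-positive and false-positive rates, and diseased and non-diseased cases drawn as independent binomial samples. It builds the exact joint distribution of the observed ROC point for a given sample size, takes a 95% highest-probability region as a stand-in for a confidence contour, and checks the region's coverage by Monte Carlo simulation. All function names and parameter values are hypothetical.

```python
import math
import random


def binom_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def joint_pmf(n_pos, n_neg, tpr, fpr):
    """Joint pmf of (true positives, false positives) at one operating point.

    Assumption (not from the paper): diseased and non-diseased cases are
    independent samples, so the joint pmf is a product of two binomials.
    """
    return {
        (tp, fp): binom_pmf(tp, n_pos, tpr) * binom_pmf(fp, n_neg, fpr)
        for tp in range(n_pos + 1)
        for fp in range(n_neg + 1)
    }


def highest_density_region(pmf, level=0.95):
    """Greedily collect the most probable outcomes until `level` is reached."""
    region, total = set(), 0.0
    ranked = sorted(pmf.items(), key=lambda kv: kv[1], reverse=True)
    for outcome, prob in ranked:
        if total >= level:
            break
        region.add(outcome)
        total += prob
    return region, total


def monte_carlo_coverage(region, n_pos, n_neg, tpr, fpr, trials=20000, seed=0):
    """Fraction of simulated studies whose observed point falls in `region`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        tp = sum(rng.random() < tpr for _ in range(n_pos))
        fp = sum(rng.random() < fpr for _ in range(n_neg))
        hits += (tp, fp) in region
    return hits / trials


if __name__ == "__main__":
    # Hypothetical small study: 10 diseased and 15 non-diseased cases.
    n_pos, n_neg, tpr, fpr = 10, 15, 0.8, 0.2
    pmf = joint_pmf(n_pos, n_neg, tpr, fpr)
    region, mass = highest_density_region(pmf)
    coverage = monte_carlo_coverage(region, n_pos, n_neg, tpr, fpr)
    print(f"exact mass of region: {mass:.4f}")
    print(f"simulated coverage:   {coverage:.4f}")
    print(f"outcomes in region:   {len(region)} of {len(pmf)}")
```

Dividing each `(tp, fp)` pair by `(n_pos, n_neg)` maps the region onto the ROC plane. Rerunning with larger `n_pos` and `n_neg` shows the region covering a smaller share of the possible outcomes, which is one simple way to see the effect of sample size that the abstract describes.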
