Assessment of diagnostic markers by goodness‐of‐fit tests

Abstract
Receiver operating characteristic (ROC) curves are useful statistical tools for assessing the accuracy of diagnostic markers or for comparing new diagnostic markers with established ones. The most common index employed for these purposes is the area under the ROC curve (θ), and several statistical tests exist that test the null hypothesis H0: θ = 0.5 or, in the case of two‐marker comparisons, H0: θ1 = θ2, against alternatives of interest. In this paper we show that goodness‐of‐fit tests of uniformity of the distribution of the false positive (true positive) rates can be used instead of tests based on the area index. A semi‐parametric approach is based on a completely specified distribution of marker measurements for either the healthy (F) or the diseased (G) subjects, and this is extended to the two‐marker case. We then extend the approach to the one‐ and two‐marker cases when neither distribution is specified (the non‐parametric case). In general, ROC‐based tests are more powerful than goodness‐of‐fit tests for location differences between the distributions of healthy and diseased subjects. However, ROC‐based tests are less powerful when location‐scale differences exist (producing ROC curves that cross the diagonal), and they are incapable of discriminating between healthy and diseased samples when θ = 0.5 but F ≠ G. In these cases, goodness‐of‐fit tests have a distinct advantage over ROC‐based tests. In conclusion, ROC methodology should be used with recognition of its potential limitations and should be replaced by goodness‐of‐fit tests when appropriate. The latter are a viable alternative and can be used as a ‘black box’ or as an exploratory first step in the evaluation of novel diagnostic markers. Copyright © 2003 John Wiley & Sons, Ltd.
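
The following is a minimal sketch, not the authors' implementation, of the contrast described above in the semi‐parametric setting where the healthy distribution F is completely specified (here assumed, purely for illustration, to be standard normal). The simulated data, sample sizes, and choice of a Kolmogorov–Smirnov statistic as the goodness‐of‐fit test of uniformity are illustrative assumptions; the diseased distribution G is given the same mean as F but a larger variance, so that θ ≈ 0.5 even though F ≠ G.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated marker values: healthy ~ F = N(0, 1), diseased ~ G = N(0, 2^2).
# A location-scale alternative with equal means keeps theta near 0.5
# although F != G (the ROC curve crosses the diagonal).
healthy = rng.normal(0.0, 1.0, size=200)
diseased = rng.normal(0.0, 2.0, size=200)

# ROC-based test: the Mann-Whitney U statistic estimates the area under
# the ROC curve, theta = P(diseased > healthy); test H0: theta = 0.5.
u, p_auc = stats.mannwhitneyu(diseased, healthy, alternative="two-sided")
theta_hat = u / (len(diseased) * len(healthy))

# Goodness-of-fit alternative: under H0 (no discrimination) the false
# positive rates 1 - F(Y_j) evaluated at the diseased measurements are
# uniform on (0, 1); test uniformity with a Kolmogorov-Smirnov statistic.
fpr = 1.0 - stats.norm.cdf(diseased)      # F fully specified (semi-parametric case)
ks_stat, p_gof = stats.kstest(fpr, "uniform")

print(f"AUC estimate: {theta_hat:.3f}, AUC-test p-value: {p_auc:.3f}")
print(f"KS uniformity statistic: {ks_stat:.3f}, p-value: {p_gof:.3f}")
```

With data of this kind the area‐based test typically fails to reject (θ̂ near 0.5), while the uniformity test detects the departure of the false positive rates from the uniform distribution, illustrating the situation in which goodness‐of‐fit tests retain power where ROC‐based tests do not.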