Comparing the Areas under More Than Two Independent ROC Curves

Abstract
For a diagnostic test, the area under the associated receiver operating characteristic (ROC) curve is considered a measure of the efficacy of the test. Statistical methodology for the comparison of the areas under more than two independent ROC curves is developed. The jackknife is used to devise an F test using the pseudovalues as data. A Studentized range (SR) test is also considered using the original area estimates. A Monte Carlo study is performed to evaluate the significance level and power of the two test statistics. Both statistics conform well to the 0.10, 0.05, and 0.01 significance levels when the sampling design is balanced between cases with and without the disease. Power is also comparable. For unbalanced designs, the SR test on the original area estimates is very conservative while the F test on pseudovalues performs well. The F test is recommended as the method of choice for comparing the areas, although for balanced designs the SR test, with its computational simplicity, may be preferred.
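
As a rough illustration of the F test on jackknife pseudovalues, the following Python sketch may help. It rests on several assumptions not stated in the abstract: the area is estimated by the Mann-Whitney statistic, the jackknife deletes one observation at a time from the pooled diseased and non-diseased samples, and the pseudovalues are compared by an ordinary one-way analysis of variance. The function names `auc`, `auc_pseudovalues`, and `compare_aucs` are hypothetical and chosen for this example only.

```python
import numpy as np
from scipy.stats import f_oneway


def auc(neg, pos):
    """Mann-Whitney estimate of the area under the ROC curve.

    neg: test results for cases without the disease
    pos: test results for cases with the disease
    Ties between a diseased and a non-diseased case count as 1/2.
    """
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    diff = pos[:, None] - neg[None, :]  # all diseased/non-diseased pairs
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))


def auc_pseudovalues(neg, pos):
    """Jackknife pseudovalues of the area estimate.

    Assumption: one observation is deleted at a time from the pooled
    sample of n = len(neg) + len(pos) cases (each group needs >= 2 cases).
    """
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    n = len(neg) + len(pos)
    full = auc(neg, pos)
    # Leave-one-out area estimates, first over non-diseased, then diseased.
    loo = [auc(np.delete(neg, i), pos) for i in range(len(neg))]
    loo += [auc(neg, np.delete(pos, j)) for j in range(len(pos))]
    return n * full - (n - 1) * np.array(loo)


def compare_aucs(samples):
    """One-way ANOVA F test on the pseudovalues of k independent tests.

    samples: list of (neg, pos) pairs, one pair per diagnostic test.
    Returns the SciPy result object with fields `statistic` and `pvalue`.
    """
    pseudo = [auc_pseudovalues(neg, pos) for neg, pos in samples]
    return f_oneway(*pseudo)
```

A call such as `compare_aucs([(neg_a, pos_a), (neg_b, pos_b), (neg_c, pos_c)])` would return an F statistic and p-value for the hypothesis that the three areas are equal. The deletion scheme used here is only one plausible reading of the abstract and may differ from the one studied in the paper.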