Estimating test power adjusted for size

1 November 2005

journal article
research article
Published by Taylor & Francis in Journal of Statistical Computation and Simulation

Vol. 75 (11) , 921-933
https://doi.org/10.1080/00949650412331321160

Abstract

Statisticians seek tests which have maximum power amongst tests of size α. In both numerical and theoretical studies, the standard approach is to compare the powers of competing tests which have the same nominal size α*. In most cases, α and α* differ, and the differing size biases of the tests then contaminate any comparison of their power. For instance, two nominal 5% tests with actual sizes 4% and 6% should not have their powers naively compared. In this paper, the basic problem of trading off size for power is approached through the existing theory of receiver operating characteristic curves. This leads us to a simple way of estimating power adjusted for size, not only for a fixed nominal size, but also for a range of relevant nominal sizes. The calculations required are both familiar and simple. We recommend that the methods be routinely applied to simulation studies that compare alternative tests of the same hypotheses.
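The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's estimator. It assumes a one-sided test that rejects for large values of the statistic, simulates the statistic under the null and under one alternative, and reads power off at the empirical critical value whose actual size does not exceed the target. The function names, the t-statistic example, the sample size, and the deliberately liberal normal critical value 1.645 are all illustrative assumptions.

```python
import random
import statistics


def rejection_rate(stats, crit):
    """Fraction of simulated statistics that exceed the critical value."""
    return sum(s > crit for s in stats) / len(stats)


def size_adjusted_power(null_stats, alt_stats, alpha):
    """Power at the empirical critical value whose actual size is <= alpha."""
    ordered = sorted(null_stats)
    n = len(ordered)
    # Number of null draws allowed to reject so that k / n <= alpha
    # (the small epsilon guards against floating-point round-down).
    k = int(alpha * n + 1e-9)
    crit = ordered[n - k - 1] if k < n else float("-inf")
    return rejection_rate(alt_stats, crit)


def roc_curve(null_stats, alt_stats):
    """(actual size, power) pairs, one per candidate critical value."""
    return [(rejection_rate(null_stats, c), rejection_rate(alt_stats, c))
            for c in sorted(set(null_stats), reverse=True)]


def t_statistic(rng, n, mu):
    """One-sample t statistic for H0: mean = 0, data drawn from N(mu, 1)."""
    xs = [rng.gauss(mu, 1.0) for _ in range(n)]
    return statistics.mean(xs) / (statistics.stdev(xs) / n ** 0.5)


# Hypothetical simulation: n = 10 observations, 5000 replications each.
rng = random.Random(0)
null_stats = [t_statistic(rng, 10, 0.0) for _ in range(5000)]
alt_stats = [t_statistic(rng, 10, 0.5) for _ in range(5000)]

# Normal-theory critical value applied to a t statistic (an assumed,
# size-biased nominal 5% test used only for illustration).
nominal_crit = 1.645
print("actual size at nominal 5%:", rejection_rate(null_stats, nominal_crit))
print("unadjusted power:", rejection_rate(alt_stats, nominal_crit))
print("size-adjusted power:", size_adjusted_power(null_stats, alt_stats, 0.05))
```

Comparing two tests by `size_adjusted_power` at the same `alpha`, or by their whole `roc_curve` over a range of sizes, puts them on an equal footing in actual size instead of nominal size.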
