Power of Tests for a Dichotomous Independent Variable Measured with Error

29 November 2007

journal article
Published by Wiley in Health Services Research

Vol. 43 (3), 1085-1101
https://doi.org/10.1111/j.1475-6773.2007.00810.x

Abstract

Objective. To examine the implications for statistical power of using predicted probabilities for a dichotomous independent variable, rather than the actual variable.

Data Sources/Study Setting. An application uses 271,479 observations from the 2000 to 2002 CAHPS Medicare Fee-for-Service surveys.

Study Design and Data. A methodological study with simulation results and a substantive application to previously collected data.

Principal Findings. Researchers often must employ key dichotomous predictors that are unobserved but for which predictions exist. We consider three approaches to such data: the classification estimator (1); the direct substitution estimator (2); and the partial information maximum likelihood estimator (3, PIMLE). The efficiency of (1) (its power relative to testing with the true variable) roughly scales with the square of one minus the classification error. The efficiency of (2) roughly scales with the R² for predicting the unobserved dichotomous variable, and (2) is usually more powerful than (1). Approach (3) is most powerful, but for testing differences in means of 0.2–0.5 standard deviations, (2) is typically more than 95 percent as efficient as (3).

Conclusions. The information loss from not observing actual values of dichotomous predictors can be quite large. Direct substitution is easy to implement and interpret and nearly as efficient as the PIMLE.
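The power comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or design. It assumes that the classification estimator means thresholding the predicted probability at 0.5, that direct substitution means regressing the outcome on the predicted probability itself, and that the predictions are calibrated. The function names, the Beta(0.5, 0.5) distribution of predicted probabilities, the sample size, and the effect size are all hypothetical choices made for illustration. The PIMLE is not implemented.

```python
import numpy as np


def row_corr(x, y):
    """Pearson correlation between matching rows of two 2-D arrays."""
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    num = (xc * yc).sum(axis=1)
    den = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return num / den


def simulate_power(n=400, delta=0.3, reps=2000, a=0.5, b=0.5, seed=0):
    """Estimate rejection rates for three choices of regressor.

    Illustrative assumptions (not taken from the article):
      - predicted probabilities p are drawn from Beta(a, b)
      - the true dichotomous variable d is Bernoulli(p), i.e. p is calibrated
      - the outcome is y = delta * d + standard normal noise
      - the two-sided test uses the normal critical value 1.96
    """
    rng = np.random.default_rng(seed)
    p = rng.beta(a, b, size=(reps, n))
    d = (rng.random((reps, n)) < p).astype(float)
    y = delta * d + rng.standard_normal((reps, n))

    regressors = {
        "true": d,                                   # unobserved in practice
        "classification": (p > 0.5).astype(float),   # thresholded prediction
        "direct": p,                                 # predicted probability
    }
    power = {}
    for name, x in regressors.items():
        r = row_corr(x, y)
        # t statistic for the slope in a simple linear regression of y on x
        t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
        power[name] = float(np.mean(np.abs(t) > 1.96))
    return power


if __name__ == "__main__":
    for name, value in simulate_power().items():
        print(f"{name:>15}: estimated power {value:.3f}")
```

Under these assumptions the regression on the true variable should reject most often, followed by direct substitution and then the classification approach. Changing `a` and `b` alters how well the predicted probabilities separate the two groups, and so how much power is lost.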
