Bias due to missing exposure data using complete‐case analysis in the proportional hazards regression model
- 28 January 2003
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 22 (4) , 545-557
- https://doi.org/10.1002/sim.1340
Abstract
We studied bias due to missing exposure data in the proportional hazards regression model when using complete‐case analysis (CCA). Eleven missing data scenarios were considered: one with missing completely at random (MCAR), four missing at random (MAR), and six non‐ignorable missingness scenarios, with a variety of hazard ratios, censoring fractions, missingness fractions and sample sizes. When missingness was MCAR or dependent only on the exposure, there was negligible bias (2–3 per cent) that was similar to the difference between the estimate in the full data set with no missing data and the true parameter. In contrast, substantial bias occurred when missingness was dependent on outcome or both outcome and exposure. For models with hazard ratio of 3.5, a sample size of 400, 20 per cent censoring and 40 per cent missing data, the relative bias for the hazard ratio ranged between 7 per cent and 64 per cent. We observed important differences in the direction and magnitude of biases under the various missing data mechanisms. For example, in scenarios where missingness was associated with longer or shorter follow‐up, the biases were notably different, although both mechanisms are MAR. The hazard ratio was underestimated (with larger bias) when missingness was associated with longer follow‐up and overestimated (with smaller bias) when associated with shorter follow‐up. If it is known that missingness is associated with a less frequently observed outcome or with both the outcome and exposure, CCA may result in an invalid inference and other methods for handling missing data should be considered. Copyright © 2003 John Wiley & Sons, Ltd.
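The kind of experiment the abstract describes can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: a single binary exposure, exponential event and censoring times, a fixed follow-up cut-off `TAU`, a hand-written Newton-Raphson fit of the Cox partial likelihood, and relative bias taken as (estimate − true) / true. The hazard ratio of 3.5 and the target of roughly 40 per cent missing exposure come from the abstract; the sample size is deliberately much larger than the 400 quoted there so that bias is not swamped by sampling noise. All function and constant names are hypothetical.

```python
import math
import random

TRUE_HR = 3.5    # hazard ratio quoted in the abstract
BASE_RATE = 1.0  # assumed baseline hazard (not from the paper)
CENS_RATE = 0.5  # assumed censoring hazard, giving roughly a fifth censored
TAU = 0.3        # assumed cut-off between "short" and "long" follow-up


def simulate(n, rng):
    """Return one (follow-up time, event indicator, exposure) tuple per subject."""
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        t = rng.expovariate(BASE_RATE * (TRUE_HR if x else 1.0))  # event time
        c = rng.expovariate(CENS_RATE)                            # censoring time
        rows.append((min(t, c), t <= c, x))
    return rows


def cox_hazard_ratio(rows, max_iter=50, tol=1e-10):
    """Fit a Cox model with one binary covariate by Newton-Raphson (no ties assumed)."""
    # Walking from the longest to the shortest time, the running counts
    # n0 and n1 are the unexposed and exposed subjects still at risk.
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    beta = 0.0
    for _ in range(max_iter):
        n0 = n1 = 0
        score = info = 0.0
        w = math.exp(beta)
        for _time, event, x in ordered:
            if x:
                n1 += 1
            else:
                n0 += 1
            if event:
                p = n1 * w / (n0 + n1 * w)  # fitted P(failing subject is exposed)
                score += x - p
                info += p * (1.0 - p)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)


def complete_cases(rows, p_missing, rng):
    """Drop each subject with probability p_missing(row), mimicking missing exposure."""
    return [r for r in rows if rng.random() >= p_missing(r)]


# Three illustrative mechanisms, each removing roughly 40 per cent of exposures.
scenarios = {
    "MCAR": lambda r: 0.4,
    "long follow-up": lambda r: 0.8 if r[0] > TAU else 0.0,
    "short follow-up": lambda r: 0.75 if r[0] <= TAU else 0.0,
}

rng = random.Random(2003)
full = simulate(40000, rng)
results = {"full data": cox_hazard_ratio(full)}
for name, mechanism in scenarios.items():
    results[name] = cox_hazard_ratio(complete_cases(full, mechanism, rng))

for name, hr in results.items():
    rel_bias = 100.0 * (hr - TRUE_HR) / TRUE_HR
    print(f"{name:>16}: HR = {hr:.2f}, relative bias = {rel_bias:+.1f} per cent")
```

Comparing the complete-case estimates with the full-data estimate separates bias caused by the missingness mechanism from ordinary sampling error. This toy setup is not expected to reproduce the paper's figures, which cover eleven scenarios and several sample sizes, censoring fractions and hazard ratios.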