Marginal Analysis of Incomplete Longitudinal Binary Data: A Cautionary Note on LOCF Imputation
- 27 August 2004
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 60 (3) , 820-828
- https://doi.org/10.1111/j.0006-341x.2004.00234.x
Abstract
In recent years there has been considerable research devoted to the development of methods for the analysis of incomplete data in longitudinal studies. Despite these advances, the methods used in practice have changed relatively little, particularly in the reporting of pharmaceutical trials. In this setting, perhaps the most widely adopted strategy for dealing with incomplete longitudinal data is imputation by the “last observation carried forward” (LOCF) approach, in which values for missing responses are imputed using observations from the most recently completed assessment. We examine the asymptotic and empirical bias, the empirical type I error rate, and the empirical coverage probability associated with estimators and tests of treatment effect based on the LOCF imputation strategy. We consider a setting involving longitudinal binary data with longitudinal analyses based on generalized estimating equations, and an analysis based simply on the response at the end of the scheduled follow‐up. We find that for both of these approaches, imputation by LOCF can lead to substantial biases in estimators of treatment effects, the type I error rates of associated tests can be greatly inflated, and the coverage probability can be far from the nominal level. Alternative analyses based on all available data lead to estimators with comparatively small bias, and inverse probability weighted analyses yield consistent estimators subject to correct specification of the missing data process. We illustrate the differences between various methods of dealing with drop‐outs using data from a study of smoking behavior.
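To make the LOCF rule described in the abstract concrete, here is a minimal Python sketch of carrying the last observed binary response forward for one subject. It is an illustration only, not the authors' code: the function name `locf`, the use of `None` to mark a missed assessment, and the example visit sequence are all assumptions made for this sketch.

```python
from typing import List, Optional


def locf(responses: List[Optional[int]]) -> List[Optional[int]]:
    """Fill missing entries (None) with the most recently observed value.

    Entries before the first observed value stay None, because there is
    nothing to carry forward yet.
    """
    filled: List[Optional[int]] = []
    last: Optional[int] = None
    for y in responses:
        if y is not None:
            last = y  # update the value to carry forward
        filled.append(last)
    return filled


# Hypothetical subject: binary response at five scheduled visits,
# observed at visits 1 and 2, then dropped out.
observed = [1, 0, None, None, None]
print(locf(observed))  # prints [1, 0, 0, 0, 0]
```

In this sketch the end-of-follow-up value is simply the response recorded at drop-out, so any analysis of the imputed series treats the subject's status as frozen from that point on; this is the assumption whose consequences for bias, type I error, and coverage the abstract examines.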
This publication has 28 references indexed in Scilit:
- Marginal Methods for Incomplete Longitudinal Data Arising in Clusters. Journal of the American Statistical Association, 2002
- Estimation in an empirical Bayes model for longitudinal and cross-sectionally clustered binary data. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 2000
- Semiparametric Regression for Repeated Outcomes with Nonignorable Nonresponse. Journal of the American Statistical Association, 1998
- Modeling the Drop-Out Mechanism in Repeated-Measures Studies. Journal of the American Statistical Association, 1995
- Analysis of Semiparametric Regression Models for Repeated Outcomes in the Presence of Missing Data. Journal of the American Statistical Association, 1995
- Marginal Modeling of Correlated Ordinal Data Using a Multivariate Plackett Distribution. Journal of the American Statistical Association, 1994
- Missing Data, Imputation, and the Bootstrap. Journal of the American Statistical Association, 1994
- Longitudinal data analysis using generalized linear models. Biometrika, 1986