Multiple imputation in a large-scale complex survey: a practical guide
- 4 August 2009
- journal article
- research article
- Published by SAGE Publications in Statistical Methods in Medical Research
- Vol. 19 (6), 653-670
- https://doi.org/10.1177/0962280208101273
Abstract
The Cancer Care Outcomes Research and Surveillance (CanCORS) Consortium is a multisite, multimode, multiwave study of the quality and patterns of care delivered to population-based cohorts of newly diagnosed patients with lung and colorectal cancer. As is typical in observational studies, missing data are a serious concern for CanCORS, following complicated patterns that impose severe challenges to the consortium investigators. Despite the popularity of multiple imputation of missing data, its acceptance and application still lag in large-scale studies with complicated data sets such as CanCORS. We use sequential regression multiple imputation, implemented in publicly available software, to deal with non-response in the CanCORS surveys and construct a centralised completed database that can be easily used by investigators from multiple sites. Our work illustrates the feasibility of multiple imputation in a large-scale multiobjective survey, showing its capacity to handle complex missing data. We present the implementation process in detail as an example for practitioners and discuss some of the challenging issues that need further research.
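For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of the general idea behind sequential regression multiple imputation, followed by Rubin's rules for pooling results. It is not the software or the models used by the authors. The function names (`sequential_regression_impute`, `combine_rubin`), the use of linear regression for every variable, and the choice to hold the fitted coefficients fixed rather than draw them from a posterior distribution are all simplifying assumptions made for illustration; the CanCORS application would require models suited to each variable type.

```python
import numpy as np


def sequential_regression_impute(data, n_imputations=5, n_cycles=10, seed=0):
    """Return a list of completed copies of `data`, where NaN marks a missing value.

    Illustrative simplification: every column is treated as continuous and
    imputed with a linear regression on all other columns. Each column is
    assumed to have at least some observed values.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    completed_sets = []

    for _ in range(n_imputations):
        filled = data.copy()
        # Starting values: random draws from each column's observed values.
        for j in range(data.shape[1]):
            observed = data[~missing[:, j], j]
            filled[missing[:, j], j] = rng.choice(observed, size=missing[:, j].sum())

        # Cycle through the columns, re-imputing each one conditional on the others.
        for _ in range(n_cycles):
            for j in range(data.shape[1]):
                miss_j = missing[:, j]
                if not miss_j.any():
                    continue
                others = np.delete(filled, j, axis=1)
                design = np.column_stack([np.ones(len(others)), others])
                # Fit the regression on rows where column j was actually observed.
                beta, *_ = np.linalg.lstsq(
                    design[~miss_j], filled[~miss_j, j], rcond=None
                )
                resid = filled[~miss_j, j] - design[~miss_j] @ beta
                dof = max(int((~miss_j).sum()) - design.shape[1], 1)
                sigma = np.sqrt(resid @ resid / dof)
                # Impute as prediction plus random noise, so that the
                # imputed values reflect residual uncertainty.
                noise = rng.normal(0.0, sigma, size=miss_j.sum())
                filled[miss_j, j] = design[miss_j] @ beta + noise

        completed_sets.append(filled)
    return completed_sets


def combine_rubin(estimates, variances):
    """Pool per-data-set estimates and variances with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()
    within = variances.mean()            # average within-imputation variance
    between = estimates.var(ddof=1)      # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total
```

In use, an analyst would run the same analysis on each completed data set, collect the point estimates and their variances, and pass them to `combine_rubin` to obtain a single estimate whose variance accounts for the uncertainty due to missing data.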