Evaluation of software for multiple imputation of semi-continuous data
- 1 June 2007
- journal article
- conference paper
- Published by SAGE Publications in Statistical Methods in Medical Research
- Vol. 16 (3) , 243-258
- https://doi.org/10.1177/0962280206074464
Abstract
It is now widely accepted that multiple imputation (MI) methods handle the uncertainty of missing data more appropriately than single imputation methods. Several standard statistical software packages, such as SAS, R and STATA, have standard procedures or user-written programs to perform MI. The performance of these packages is generally acceptable for most types of data. However, it is unclear whether these applications are appropriate for imputing data with a large proportion of zero values, which results in a semi-continuous distribution. In addition, it is not clear whether these applications are suitable when the distribution of the data needs to be preserved for subsequent analysis. This article reports the findings of a simulation study carried out to evaluate the performance of the MI procedures for handling semi-continuous data within these statistical packages. Complete resource use data on 1060 participants from a large randomized clinical trial were used as the simulation population, from which 500 bootstrap samples were obtained and missing data imposed. The findings of this study showed differences in the performance of the MI programs when imputing semi-continuous data. Caution should be exercised when deciding which program should perform MI on this type of data.
This publication has 29 references indexed in Scilit:
- Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 2006
- Surgical stabilisation of the spine compared with a programme of intensive rehabilitation for the management of patients with chronic low back pain: cost utility analysis based on a randomised controlled trial. BMJ, 2005
- Multiple Imputation for Incomplete Data With Semicontinuous Variables. Journal of the American Statistical Association, 2003
- Missing.... presumed at random: cost‐analysis of incomplete data. Health Economics, 2002
- International Subarachnoid Aneurysm Trial (ISAT) of neurosurgical clipping versus endovascular coiling in 2143 patients with ruptured intracranial aneurysms: a randomised trial. The Lancet, 2002
- Missing data: Our view of the state of the art. Psychological Methods, 2002
- Multiple Imputation for Missing Data. Sociological Methods & Research, 2000
- The Calculation of Posterior Distributions by Data Augmentation. Journal of the American Statistical Association, 1987
- Multiple Imputation for Interval Estimation from Simple Random Samples with Ignorable Nonresponse. Journal of the American Statistical Association, 1986
- The central role of the propensity score in observational studies for causal effects. Biometrika, 1983