Plausibility of multivariate normality assumption when multiply imputing non-Gaussian continuous outcomes: a simulation assessment
- 16 January 2008
- journal article
- research article
- Published by Taylor & Francis in Journal of Statistical Computation and Simulation
- Vol. 78 (1), 69-84
- https://doi.org/10.1080/10629360600903866
Abstract
Multiple imputation under the assumption of multivariate normality has emerged in recent years as a frequently used model-based approach to handling incomplete continuous data. Despite its simplicity and popularity, however, its plausibility has not been thoroughly evaluated via simulation. In this work, the performance of multiple imputation under a multivariate Gaussian model with unstructured covariances was examined on a broad range of simulated incomplete data sets exhibiting distributional characteristics, such as skewness and multimodality, that a Gaussian model does not accommodate. The behavior of efficiency and accuracy measures was explored to determine the extent to which the procedure works properly. The conclusion drawn is that, although real data rarely conform to multivariate normality, imputation under the assumption of normality is a fairly reasonable tool even when that assumption is clearly violated and the fraction of missing information is high, especially when the sample size is relatively large. Although we discourage its uncritical, automatic and possibly inappropriate use, we report that its performance is better than we expected, leading us to believe that it is probably an underrated approach.
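The kind of experiment the abstract describes can be illustrated with a minimal sketch. This is not the authors' code or simulation design. It assumes a much simpler setting: one skewed (lognormal) outcome, one fully observed covariate, values missing completely at random, and a normal imputation model fitted by Bayesian linear regression. All function names and parameter values (`impute_normal`, `rubin_combine`, `run_demo`, `n`, `m`, `miss_rate`) are hypothetical choices made for illustration.

```python
import numpy as np


def impute_normal(x, y, rng):
    """Return one completed copy of y, imputing NaNs from a normal regression of y on x.

    Parameters are drawn from their posterior under a noninformative prior
    (an assumption of this sketch), so repeated calls give proper imputations.
    """
    obs = ~np.isnan(y)
    n_obs, n_mis = obs.sum(), (~obs).sum()
    X = np.column_stack([np.ones(n_obs), x[obs]])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y[obs]
    resid = y[obs] - X @ beta_hat
    # Draw the residual variance, then the coefficients given that variance.
    sigma2 = resid @ resid / rng.chisquare(n_obs - X.shape[1])
    beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
    # Impute missing outcomes as predicted value plus normal noise.
    X_mis = np.column_stack([np.ones(n_mis), x[~obs]])
    y_imp = y.copy()
    y_imp[~obs] = X_mis @ beta + rng.normal(0.0, np.sqrt(sigma2), n_mis)
    return y_imp


def rubin_combine(estimates, variances):
    """Combine m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = np.mean(estimates)        # pooled point estimate
    within = np.mean(variances)       # average within-imputation variance
    between = np.var(estimates, ddof=1)  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


def run_demo(n=500, m=20, miss_rate=0.3, seed=0):
    """Simulate skewed data, delete values at random, impute m times, pool the mean."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Lognormal outcome: clearly non-Gaussian, so the imputation model is misspecified.
    y_full = np.exp(0.5 * x + rng.normal(scale=0.5, size=n))
    y = y_full.copy()
    y[rng.random(n) < miss_rate] = np.nan
    estimates, variances = [], []
    for _ in range(m):
        y_imp = impute_normal(x, y, rng)
        estimates.append(y_imp.mean())
        variances.append(y_imp.var(ddof=1) / n)
    q_bar, total_var = rubin_combine(estimates, variances)
    return q_bar, total_var, y_full.mean()


if __name__ == "__main__":
    q_bar, total_var, full_mean = run_demo()
    print(f"pooled mean: {q_bar:.3f}  (complete-data mean: {full_mean:.3f})")
    print(f"pooled standard error: {np.sqrt(total_var):.3f}")
```

Comparing the pooled mean with the complete-data mean over many replications, and over different values of `n` and `miss_rate`, would give rough analogues of the accuracy and efficiency measures mentioned in the abstract. A single run such as this one says little on its own.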