Multiple Imputation for Missing Data: Fully Conditional Specification Versus Multivariate Normal Imputation
Open Access
- 27 January 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in American Journal of Epidemiology
- Vol. 171 (5) , 624-632
- https://doi.org/10.1093/aje/kwp425
Abstract
Statistical analysis in epidemiologic studies is often hindered by missing data, and multiple imputation is increasingly being used to handle this problem. In a simulation study, the authors compared 2 methods for imputation that are widely available in standard software: fully conditional specification (FCS) or “chained equations” and multivariate normal imputation (MVNI). The authors created data sets of 1,000 observations to simulate a cohort study, and missing data were induced under 3 missing-data mechanisms. Imputations were performed using FCS (Royston's “ice”) and MVNI (Schafer's NORM) in Stata (Stata Corporation, College Station, Texas), with transformations or prediction matching being used to manage nonnormality in the continuous variables. Inferences for a set of regression parameters were compared between these approaches and a complete-case analysis. As expected, both FCS and MVNI were generally less biased than complete-case analysis, and both produced similar results despite the presence of binary and ordinal variables that clearly did not follow a normal distribution. Ignoring skewness in a continuous covariate led to large biases and poor coverage for the corresponding regression parameter under both approaches, although inferences for other parameters were largely unaffected. These results provide reassurance that similar results can be expected from FCS and MVNI in a standard regression analysis involving variously scaled variables.
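To make the kind of comparison described in the abstract concrete, the following is a minimal Python sketch, not the authors' Stata code and not a reimplementation of "ice" or NORM. Everything in it is an assumption made for illustration: the toy data-generating model, the single missing-data mechanism (missingness in two covariates depending on a fully observed outcome), the helper names (`simulate`, `induce_mar`, `fcs_impute`, `analysis`, `pool`), and the use of 5 imputations. The chained-equations step is deliberately simplified: it draws imputations from fitted linear regressions with residual noise but does not draw the regression parameters from their posterior, so it understates imputation uncertainty. Estimates are pooled with Rubin's rules, which the abstract does not mention explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=1000):
    """Toy cohort (assumed model): two correlated covariates, one outcome."""
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)
    return np.column_stack([x1, x2, y])

def induce_mar(data):
    """Make x1 and x2 missing with probability depending on the observed y."""
    out = data.copy()
    p = 0.5 / (1.0 + np.exp(-(data[:, 2] - 1.0)))  # higher y, more missingness
    out[rng.random(len(out)) < p, 0] = np.nan
    out[rng.random(len(out)) < p, 1] = np.nan
    return out

def ols(X, y):
    """Least-squares coefficients and residual standard deviation."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(resid @ resid / (len(y) - X.shape[1]))
    return beta, sigma

def fcs_impute(data, n_cycles=10):
    """One imputed data set via chained linear regressions (simplified)."""
    miss = np.isnan(data)
    filled = data.copy()
    col_means = np.nanmean(data, axis=0)
    for j in range(data.shape[1]):
        filled[miss[:, j], j] = col_means[j]  # crude starting values
    for _ in range(n_cycles):
        for j in range(data.shape[1]):
            if not miss[:, j].any():
                continue
            # Regress variable j on all other variables, using rows where j is observed.
            X = np.column_stack([np.ones(len(filled)), np.delete(filled, j, axis=1)])
            obs = ~miss[:, j]
            beta, sigma = ols(X[obs], filled[obs, j])
            # Replace missing entries with prediction plus residual noise.
            noise = rng.normal(scale=sigma, size=miss[:, j].sum())
            filled[miss[:, j], j] = X[miss[:, j]] @ beta + noise
    return filled

def analysis(data):
    """Analysis model: linear regression of y on x1 and x2."""
    X = np.column_stack([np.ones(len(data)), data[:, :2]])
    beta, sigma = ols(X, data[:, 2])
    var = sigma**2 * np.diag(np.linalg.inv(X.T @ X))
    return beta, var

def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance."""
    m = len(estimates)
    within = np.mean(variances, axis=0)
    between = np.var(estimates, axis=0, ddof=1)
    return np.mean(estimates, axis=0), within + (1 + 1 / m) * between

full = simulate()
incomplete = induce_mar(full)

# Complete-case analysis: drop every row with any missing value.
cc = incomplete[~np.isnan(incomplete).any(axis=1)]
cc_beta, _ = analysis(cc)

# Multiple imputation with 5 imputed data sets (an arbitrary choice here).
results = [analysis(fcs_impute(incomplete)) for _ in range(5)]
mi_beta, mi_var = pool([r[0] for r in results], [r[1] for r in results])

print("full data     :", np.round(analysis(full)[0], 3))
print("complete case :", np.round(cc_beta, 3))
print("FCS-style MI  :", np.round(mi_beta, 3))
```

A simulation study of the kind the abstract describes would repeat this over many generated data sets and several missing-data mechanisms, and would summarize bias and confidence-interval coverage for each regression parameter rather than inspect a single run.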