4. Regression with Missing Ys: An Improved Strategy for Analyzing Multiply Imputed Data
Open Access
- 1 August 2007
- journal article
- Published by SAGE Publications in Sociological Methodology
- Vol. 37 (1), 83-117
- https://doi.org/10.1111/j.1467-9531.2007.00180.x
Abstract
When fitting a generalized linear model (such as linear regression, logistic regression, or hierarchical linear modeling), analysts often wonder how to handle missing values of the dependent variable Y. If missing values have been filled in using multiple imputation, the usual advice is to use the imputed Y values in analysis. We show, however, that using imputed Ys can add needless noise to the estimates. Better estimates can usually be obtained using a modified strategy that we call multiple imputation, then deletion (MID). Under MID, all cases are used for imputation but, following imputation, cases with imputed Y values are excluded from the analysis. When there is something wrong with the imputed Y values, MID protects the estimates from the problematic imputations. And when the imputed Y values are acceptable, MID usually offers somewhat more efficient estimates than an ordinary MI strategy.
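The MID procedure described in the abstract (impute using all cases, delete the cases whose Y was imputed, then analyze) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. The names `fit_line`, `impute_once`, `mid_slope`, and `make_demo` are hypothetical, the synthetic data are invented for the example, and the imputation step is a crude stochastic regression imputation standing in for a proper multiple-imputation routine. It assumes one predictor X and no case missing both X and Y. Estimates are pooled with Rubin's standard combining rules.

```python
import random
import statistics


def fit_line(xs, ys):
    """OLS of y on x: returns (intercept, slope, residual_sd, slope_variance)."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    resid_var = rss / (n - 2)
    return intercept, slope, resid_var ** 0.5, resid_var / sxx


def impute_once(rows, rng):
    """Crude stochastic regression imputation (assumption: a stand-in for real MI).

    Assumes no row is missing both x and y.
    """
    complete = [(x, y) for x, y in rows if x is not None and y is not None]
    cx, cy = zip(*complete)
    a_y, b_y, sd_y, _ = fit_line(cx, cy)  # model for Y given X
    a_x, b_x, sd_x, _ = fit_line(cy, cx)  # model for X given Y
    filled = []
    for x, y in rows:
        if y is None:
            y = a_y + b_y * x + rng.gauss(0, sd_y)
        elif x is None:
            x = a_x + b_x * y + rng.gauss(0, sd_x)
        filled.append((x, y))
    return filled


def mid_slope(rows, m=20, seed=0):
    """Multiple imputation, then deletion: pooled slope and its standard error."""
    rng = random.Random(seed)
    y_observed = [y is not None for _, y in rows]
    estimates, variances = [], []
    for _ in range(m):
        filled = impute_once(rows, rng)                        # 1. impute with all cases
        kept = [r for r, ok in zip(filled, y_observed) if ok]  # 2. drop imputed-Y cases
        _, slope, _, var = fit_line(*zip(*kept))               # 3. analyze what remains
        estimates.append(slope)
        variances.append(var)
    # Rubin's rules: total variance = within + (1 + 1/m) * between
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    return pooled, (within + (1 + 1 / m) * between) ** 0.5


def make_demo(n=200, seed=1):
    """Synthetic data with true slope 2.0; about 20% missing Y and 20% missing X."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 1.0 + 2.0 * x + rng.gauss(0, 1)
        u = rng.random()
        if u < 0.2:
            rows.append((x, None))
        elif u < 0.4:
            rows.append((None, y))
        else:
            rows.append((x, y))
    return rows


if __name__ == "__main__":
    slope, se = mid_slope(make_demo())
    print(f"MID slope estimate: {slope:.3f} (SE {se:.3f})")
```

In this sketch the cases with missing Y still contribute to the imputation model for X, but they are excluded before the analysis model is fit, which is the distinction the abstract draws between MID and an ordinary MI analysis. Because the imputation step here does not draw the imputation-model parameters from their posterior, the reported standard error is likely somewhat too small; a real analysis would use an established MI implementation.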