Imputing Cross-Sectional Missing Data: Comparison of Common Techniques
- 1 July 2005
- journal article
- research article
- Published by SAGE Publications in Australian & New Zealand Journal of Psychiatry
- Vol. 39 (7) , 583-590
- https://doi.org/10.1080/j.1440-1614.2005.01630.x
Abstract
Objective: Increasing awareness of how missing data affects the analysis of clinical and public health interventions has led to increasing numbers of missing data procedures. There is little advice regarding which procedures should be selected under different circumstances. This paper compares six popular procedures: listwise deletion, item mean substitution, person mean substitution at two levels, regression imputation and hot deck imputation. Method: Using a complete dataset, each was examined under a variety of sample sizes and differing levels of missing data. The criteria were the true t-values for the entire sample. Results: The results suggest important differences. If missing data are from a scale where about half the items are present, hot deck imputation or person mean substitution are best. Because person mean substitution is computationally simpler, similar in its efficiency, advocated by other researchers and more likely to be an option on statistical software packages, it is the method of choice. If the missing data are from a scale where more than half the items are missing, or with single-item measures, then hot deck imputation is recommended. The findings also showed that listwise deletion and item mean substitution performed poorly. Conclusions: Person mean and hot deck imputation are preferred. Since listwise deletion and item mean substitution performed poorly, yet are the most widely reported methods, the findings have broad implications.
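To make the simpler procedures named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a respondents-by-items score matrix with NaN marking missing answers. The function names, the toy data and the `min_present` threshold are hypothetical; the threshold is only a stand-in for the idea of applying person mean substitution when enough items are present and should not be read as the paper's "two levels". Regression and hot deck imputation are omitted.

```python
import numpy as np

def listwise_deletion(scores):
    """Drop every respondent (row) that has at least one missing item."""
    scores = np.asarray(scores, dtype=float)
    return scores[~np.isnan(scores).any(axis=1)]

def item_mean_substitution(scores):
    """Fill each missing value with the mean of that item (column) over respondents."""
    scores = np.asarray(scores, dtype=float)
    item_means = np.nanmean(scores, axis=0)
    return np.where(np.isnan(scores), item_means, scores)

def person_mean_substitution(scores, min_present=0.5):
    """Fill each missing value with the respondent's own mean over answered items.

    min_present is a hypothetical cut-off: rows with a smaller share of
    answered items are left unfilled.
    """
    scores = np.asarray(scores, dtype=float)
    present = ~np.isnan(scores)
    counts = present.sum(axis=1)
    sums = np.where(present, scores, 0.0).sum(axis=1)
    # Row means over answered items; rows with no answers stay NaN.
    person_means = np.divide(sums, counts,
                             out=np.full(len(scores), np.nan),
                             where=counts > 0)
    enough = present.mean(axis=1) >= min_present
    fill = ~present & enough[:, None]
    return np.where(fill, person_means[:, None], scores)

# Toy 3-respondent, 4-item scale (made-up values, for illustration only).
nan = np.nan
scores = np.array([
    [4.0, 5.0, nan, 3.0],   # one item missing
    [2.0, nan, nan, nan],   # most items missing
    [1.0, 2.0, 3.0, 4.0],   # complete
])

complete_cases = listwise_deletion(scores)
item_filled = item_mean_substitution(scores)
person_filled = person_mean_substitution(scores)
```

In this toy data, listwise deletion keeps only the third respondent, which illustrates how quickly it discards cases. Item mean substitution fills every gap with a column average, while person mean substitution fills the first respondent's gap with that respondent's own average and leaves the second respondent unfilled because too few items were answered.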