Models for Categorical Data with Nonignorable Nonresponse

1 March 1994

journal article
research article
Published by JSTOR in Journal of the American Statistical Association

Vol. 89 (425), 44
https://doi.org/10.2307/2291199

Abstract

When categorical outcomes are subject to nonignorable nonresponse, log-linear models may be used to adjust for the nonresponse. The models are fitted to the data in an augmented frequency table in which one index corresponds to whether or not the subject is a respondent. The likelihood function is maximized over pseudo-observed cell frequencies with respect to this log-linear model using an EM algorithm. Each E step of the EM algorithm determines the pseudo-observed cell frequencies, and the M step yields the maximum likelihood estimators (MLE's) of these pseudo-observed cell frequencies. This approach may produce boundary estimates for the expected cell frequencies of the nonrespondents. In these cases the estimators of the log-linear model parameters are not uniquely determined and may be unstable. Following the approach of Clogg et al., we propose a Bayesian method that uses smoothing constants to adjust the pseudo-observed cell frequencies so that the solution is not on the boundary. The role of smoothing constants is similar to that of the flattening constant k in ridge regression; the use of k is intended to overcome ill-conditioned situations where correlations between the various predictors in the regression model produce unstable parameter estimates. The Bayesian estimation procedure is illustrated using data from a cross-sectional study of obesity in school-age children. Through a simulation study, we show that when fitting nonignorable nonresponse models, the mean squared errors of the expected cell frequencies obtained by the Bayesian procedure can be much smaller than those of the MLE's.
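The EM scheme described in the abstract can be sketched for a small case. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes a covariate X and an outcome Y, with Y missing for nonrespondents, and the log-linear model [XY][YR], in which response R depends on Y but not on X given Y. For this decomposable model the M step has a closed form. The function name `fit_em`, the argument names, and the way the smoothing constant `delta` is added to each nonrespondent pseudo-observed cell are hypothetical choices made for illustration.

```python
import numpy as np

def fit_em(n_resp, m_nonresp, delta=0.0, n_iter=500):
    """EM sketch for the log-linear model [XY][YR] on an X x Y x R table.

    n_resp[x, y] : respondent counts (R = 0), fully classified
    m_nonresp[x] : nonrespondent counts (R = 1), Y unobserved
    delta        : assumed smoothing constant added to every nonrespondent
                   pseudo-observed cell (delta = 0 gives plain ML)
    Returns the fitted expected cell frequencies mu[x, y, r].
    """
    n_resp = np.asarray(n_resp, dtype=float)
    m_nonresp = np.asarray(m_nonresp, dtype=float)
    nx, ny = n_resp.shape

    # Starting values: spread each row of nonrespondents evenly over Y.
    mu = np.empty((nx, ny, 2))
    mu[:, :, 0] = n_resp
    mu[:, :, 1] = m_nonresp[:, None] / ny

    for _ in range(n_iter):
        # E step: allocate nonrespondents in row x across Y in proportion
        # to the current fitted nonrespondent frequencies.
        cond = mu[:, :, 1] / mu[:, :, 1].sum(axis=1, keepdims=True)
        pseudo = np.empty_like(mu)
        pseudo[:, :, 0] = n_resp
        pseudo[:, :, 1] = m_nonresp[:, None] * cond + delta

        # M step: closed-form fit of [XY][YR] to the pseudo-observed table,
        # mu[x, y, r] = N[x, y, +] * N[+, y, r] / N[+, y, +].
        n_xy = pseudo.sum(axis=2)
        n_yr = pseudo.sum(axis=0)
        n_y = n_yr.sum(axis=1)
        mu = n_xy[:, :, None] * n_yr[None, :, :] / n_y[None, :, None]

    return mu
```

Calling `fit_em(n, m)` with `delta=0.0` corresponds to the unsmoothed ML fit, which can drive some nonrespondent cells toward zero. A small positive `delta` keeps every fitted nonrespondent cell strictly positive, which is the role the abstract assigns to the smoothing constants.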
