Pattern-Mixture Models for Multivariate Incomplete Data

1 March 1993

journal article
research article
Published by Taylor & Francis in Journal of the American Statistical Association

Vol. 88 (421) , 125-134
https://doi.org/10.1080/01621459.1993.10594302

Abstract

Consider a random sample on variables X₁, …, X_V with some values of X_V missing. Selection models specify the distribution of X₁, …, X_V over respondents and nonrespondents to X_V, and the conditional distribution that X_V is missing given X₁, …, X_V. In contrast, pattern-mixture models specify the conditional distribution of X₁, …, X_V given that X_V is observed or missing, respectively, and the marginal distribution of the binary indicator for whether or not X_V is missing. For multivariate data with a general pattern of missing values, the literature has tended to adopt the selection-modeling approach (see, for example, Little and Rubin); here, pattern-mixture models are proposed for this more general problem. Pattern-mixture models are chronically underidentified; in particular, for the case of univariate nonresponse mentioned above, there are no data on the distribution of X_V given X₁, …, X_{V−1} in the stratum with X_V missing. Thus the models require restrictions or prior information to identify the parameters. Complete-case restrictions tie unidentified parameters to their (identified) analogs in the stratum of complete cases. Alternative types of restriction tie unidentified parameters to parameters in other missing-value patterns or sets of such patterns. This large set of possible identifying restrictions yields a rich class of missing-data models. Unlike ignorable selection models, which generally require iterative methods except for special missing-data patterns, some pattern-mixture models yield explicit ML estimates for general patterns. Such models are readily amenable to Bayesian methods and form a convenient basis for multiple imputation. Some previously considered noniterative estimation methods are shown to be maximum likelihood (ML) under a pattern-mixture model. For example, Buck's method for continuous data, corrected as in Beale and Little (1975), and Brown's estimators for nonrandomly missing data are ML for pattern-mixture models with particular complete-case restrictions. Available-case analyses, where the mean and variance of X_j are computed using all cases with X_j observed and the correlation (or covariance) of X_j and X_k is computed using all cases with X_j and X_k observed, are also close to ML for another pattern-mixture model. Asymptotic theory for this class of estimators is outlined.
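
The two noniterative ideas described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article; the function names `pattern_mixture_mean` and `available_case_moments` are hypothetical, and the first function assumes the simplest setting: one fully observed variable x, one variable y with missing values, and a complete-case restriction that sets the regression of y on x in the missing stratum equal to the regression among complete cases. Under that assumption the overall mean of y is a closed-form mixture of the two stratum means, with no iteration needed.

```python
import numpy as np


def pattern_mixture_mean(x, y):
    """Mean of y under a two-pattern mixture with a complete-case restriction.

    x: fully observed variable; y: variable with np.nan where missing.
    Assumption (illustrative): the regression of y on x in the missing
    stratum equals the regression estimated from the complete cases.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    pi_obs = observed.mean()  # estimated proportion with y observed

    # Stratum with y observed: every parameter is identified.
    x_cc, y_cc = x[observed], y[observed]
    slope = np.cov(x_cc, y_cc, ddof=0)[0, 1] / np.var(x_cc)
    intercept = y_cc.mean() - slope * x_cc.mean()

    # Stratum with y missing: only the distribution of x is identified,
    # so the restriction borrows the complete-case regression.
    mean_y_missing = intercept + slope * x[~observed].mean()

    # Mix the stratum means by the estimated pattern proportions.
    return pi_obs * y_cc.mean() + (1.0 - pi_obs) * mean_y_missing


def available_case_moments(data):
    """Available-case means, variances, and correlations.

    Means and variances of column j use all rows with column j observed;
    the correlation of columns j and k uses all rows with both observed.
    """
    data = np.asarray(data, dtype=float)
    p = data.shape[1]
    means = np.nanmean(data, axis=0)
    variances = np.nanvar(data, axis=0, ddof=1)
    corr = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            both = ~np.isnan(data[:, j]) & ~np.isnan(data[:, k])
            r = np.corrcoef(data[both, j], data[both, k])[0, 1]
            corr[j, k] = corr[k, j] = r
    return means, variances, corr
```

For instance, `pattern_mixture_mean([1, 2, 3, 4], [2, 4, np.nan, np.nan])` combines the complete-case mean of y with the mean predicted for the missing stratum from the complete-case regression line. Other identifying restrictions would change only how the missing-stratum mean is filled in.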

This publication has 22 references indexed in Scilit:

An approximation to maximum likelihood estimates in reduced models
Biometrika, 1990
Approximately calibrated small sample inference about means from bivariate normal data with missing values
Computational Statistics & Data Analysis, 1988
A Test of Missing Completely at Random for Multivariate Data with Missing Values
Journal of the American Statistical Association, 1988
Regression Analysis for Categorical Variables with Outcome Subject to Nonignorable Nonresponse
Journal of the American Statistical Association, 1988
Causal Models for Patterns of Nonresponse
Journal of the American Statistical Association, 1986
Selection Modeling Versus Mixture Modeling with Nonignorable Nonresponse
Published by Springer Nature, 1986
Imputation of Missing Values When the Probability of Response Depends on the Variable Being Imputed
Journal of the American Statistical Association, 1982
Models for Nonresponse in Sample Surveys
Journal of the American Statistical Association, 1982
Two-Dimensional Contingency Tables with Both Completely and Partially Cross-Classified Data
Published by JSTOR, 1974
Maximum Likelihood Estimates for a Multivariate Normal Distribution when Some Observations are Missing
Journal of the American Statistical Association, 1957