Use of Missing-Data Methods to Correct Bias and Improve Precision in Case-Control Studies in which Cases Are Subtyped but Subtype Information Is Incomplete
Open Access
- 15 November 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in American Journal of Epidemiology
- Vol. 154 (10) , 954-962
- https://doi.org/10.1093/aje/154.10.954
Abstract
Histologic and genetic markers can sometimes make it possible to refine a disease into subtypes. In a case-control study, an attempt to subcategorize a disease in this way can be important to elucidating its etiology if the subtypes tend to result from distinct causal pathways. Using subtyped case outcomes, one can either carry out a case-case analysis to investigate etiologic heterogeneity or fit a polytomous logistic regression to estimate subtype-specific odds ratios. Unfortunately, especially when such an analysis is undertaken after the study has been completed, it may be compromised by the unavailability of tissue specimens, resulting in missing subtype data for many enrolled cases. The authors propose that one can more fully use the available data, including that provided by cases with missing subtype, by using the expectation-maximization algorithm to estimate risk parameters. For illustration, they apply the method to a study of non-Hodgkin's lymphoma in the midwestern United States. Simulations then demonstrate that, under assumptions likely to hold in many settings, the approach eliminates the bias that would arise if unclassified cases were ignored and also improves the precision of estimation. Under the same assumptions, empirical confidence interval coverage is consistent with the nominal 95%.
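The idea of using unclassified cases through expectation-maximization can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single binary exposure, so that the polytomous logistic model is saturated and the M-step reduces to updating expected cell counts. It also assumes that whether a case is subtyped may depend on exposure but not on the subtype itself. The function name, argument layout, and all counts are hypothetical.

```python
def em_subtype_odds_ratios(controls, typed_cases, untyped_cases,
                           max_iter=500, tol=1e-10):
    """Estimate subtype-specific odds ratios with EM (illustrative sketch).

    controls:      {0: n_unexposed, 1: n_exposed}
    typed_cases:   {subtype: {0: n_unexposed, 1: n_exposed}}
    untyped_cases: {0: n_unexposed, 1: n_exposed} (cases with missing subtype)
    """
    subtypes = list(typed_cases)
    # Start from equal expected counts, i.e. equal subtype shares.
    expected = {s: {0: 1.0, 1: 1.0} for s in subtypes}

    for _ in range(max_iter):
        updated = {s: {} for s in subtypes}
        for x in (0, 1):
            total = sum(expected[s][x] for s in subtypes)
            for s in subtypes:
                # E-step: probability that an untyped case with exposure x
                # belongs to subtype s, under the current estimates.
                share = expected[s][x] / total
                # M-step (closed form for the saturated model): observed
                # count plus the fractional allocation of untyped cases.
                updated[s][x] = typed_cases[s][x] + untyped_cases[x] * share
        change = max(abs(updated[s][x] - expected[s][x])
                     for s in subtypes for x in (0, 1))
        expected = updated
        if change < tol:
            break

    # Odds ratio for each subtype versus controls, exposed vs. unexposed.
    return {s: (expected[s][1] / controls[1]) / (expected[s][0] / controls[0])
            for s in subtypes}


# Hypothetical counts: subtype is missing more often among unexposed cases.
controls = {0: 400, 1: 100}
typed_cases = {"A": {0: 60, 1: 60}, "B": {0: 60, 1: 20}}
untyped_cases = {0: 120, 1: 20}

em_estimates = em_subtype_odds_ratios(controls, typed_cases, untyped_cases)
# Complete-case estimates ignore the untyped cases entirely.
complete_case = {s: (typed_cases[s][1] / controls[1]) /
                    (typed_cases[s][0] / controls[0]) for s in typed_cases}
print(em_estimates, complete_case)
```

In these made-up data the complete-case odds ratios differ from the EM estimates because the fraction of unclassified cases differs by exposure. With continuous or multiple covariates the M-step would instead require fitting a weighted polytomous logistic regression at each iteration.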