Adjustment for Response Bias Via Two-phase Analysis

Abstract
Background: Records-based studies often have limited covariate data, leading some researchers to collect survey data on a subset. Results for survey responders may be biased by selective nonresponse and will be less precise because the responder sample is smaller. We use data from a study of air pollution and birth outcomes to illustrate how a 2-phase analysis can yield less biased and more precise results. Methods: Our phase 1 group was a cohort of Los Angeles births, from which we obtained a phase 2 group of survey responders. We compared estimates of the odds ratio (OR) between entire-pregnancy carbon monoxide (CO) exposure and low birth weight in the phase 1 and phase 2 groups, adjusting only for variables available for both groups. Results: For CO exposure of 1 part per million or higher, the conventional adjusted ORs and 95% confidence intervals for low birth weight were 1.15 (1.06–1.25) for the phase 1 group and 1.33 (1.06–1.68) for the phase 2 group, suggesting a possible response bias and decreased precision in the latter estimate. We performed 2-phase analyses of the survey responders and found results similar to those for the cohort when we accounted for possible differential response by CO exposure. In our final analysis, we included both birth record and survey variables in a 2-phase model corrected for possible response bias. The results from weighted likelihood, pseudolikelihood, and maximum likelihood were similar: 1.13 (1.03–1.25), 1.14 (1.01–1.29), and 1.10 (0.97–1.24), respectively. Conclusion: Our approach provides a means of checking for response bias and adjusting both point and interval estimates to account for differential response.
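
As a rough illustration of the weighting idea behind a weighted-likelihood 2-phase analysis, the sketch below reweights responders by the inverse of their response fraction within each exposure-by-outcome stratum. All counts are hypothetical and are not the study's data; the function and variable names are ours. The sketch is deliberately crude: it computes an unadjusted OR only, whereas the analyses described above adjust for covariates and require variance estimates that account for the 2-phase design, neither of which is shown here.

```python
# Hypothetical counts keyed by (exposed, low_birth_weight); NOT the study's data.
phase1 = {(1, 1): 200, (1, 0): 9800, (0, 1): 300, (0, 0): 19700}  # full cohort
phase2 = {(1, 1): 50, (1, 0): 1960, (0, 1): 60, (0, 0): 3940}     # survey responders


def odds_ratio(table):
    """Crude odds ratio from a 2x2 table keyed by (exposure, outcome)."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


# Weight for each stratum: phase 1 count divided by phase 2 count,
# i.e. the inverse of the observed response fraction in that stratum.
weights = {cell: phase1[cell] / phase2[cell] for cell in phase1}

# Weighted responder table: each responder stands in for `weight` cohort members.
weighted_phase2 = {cell: weights[cell] * phase2[cell] for cell in phase2}

or_cohort = odds_ratio(phase1)
or_responders = odds_ratio(phase2)
or_weighted = odds_ratio(weighted_phase2)

print(f"Phase 1 (cohort) OR:        {or_cohort:.2f}")
print(f"Phase 2 (responders) OR:    {or_responders:.2f}")
print(f"Phase 2 weighted OR:        {or_weighted:.2f}")
```

In this constructed example, exposed mothers of low-birth-weight infants respond at a higher rate (25%) than the other three strata (20%), so the responder-only OR is inflated relative to the cohort OR, and the stratum weights restore the cohort value. With a crude 2x2 table this recovery is exact by construction; the practical value of the 2-phase approach lies in models that also include survey covariates available only for responders.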