Efficiency of Synthetic Retrospective Studies

Abstract
In many large cohort studies of association between a disease and a concomitant variable, only a small fraction of subjects develop the disease. Substantial computational expense can be avoided by restricting the analysis to the diseased cases and a random sample of disease‐free controls. This paper examines the efficiency of such synthetic retrospective designs relative to that of the full cohort analysis when the association is studied using the logistic or proportional hazards model. Within this context the efficiencies of matched vs. unmatched designs are also examined.