Abstract
Data can arise as a length-biased sample rather than as a random sample, e.g. a sample of patients in hospitals or of network cable lines, where experimental units with longer stays or longer lines are more likely to be sampled. The distribution arising from a single length-biased sampling (LBS) time has been derived (e.g. The Statistical Analysis of Discrete Time Events. Oxford Press: London, 1972) and applies when the observed outcome is the random variable subjected to LBS. Zelen (Breast Cancer: Trends in Research and Treatment. Raven Press: New York, 1976; 287–301) noted that cases of disease detected by a screening program likewise form a length-biased sample of all cases, since cases with longer sojourn times are more likely to be screen detected. In contrast to the samples of hospital stays and cable lines, however, the length-biased sojourn times (preclinical durations) cannot be observed, although the subsequent clinical durations (survival times) can. This article quantifies the effect of LBS of the sojourn times on the distribution of the observed clinical durations when cases undergo periodic screening for the early detection of disease. We show that, when preclinical and clinical durations are positively correlated, the mean, median, and quartiles of the distribution of the clinical duration among screen-detected cases can be substantially inflated, even in the absence of any survival benefit from the screening procedure. Screening studies that report mean survival time need to account for the fact that, even in the absence of any real benefit, mean survival among screen-detected cases will be longer than that among interval cases or among cases that arise in the control arm, above and beyond lead time bias, simply by virtue of the LBS phenomenon. Published in 2009 by John Wiley & Sons, Ltd.
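The inflation described above can be illustrated with a small simulation. The following is a minimal sketch only and is not the model used in the article: it assumes, purely for illustration, that sojourn and clinical durations are jointly lognormal with log-scale correlation rho, and it replaces periodic screening with the simplification that a case is selected with probability proportional to its sojourn time. The function name simulate and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate(n_cases=20000, rho=0.7, seed=0):
    """Compare clinical durations in all cases vs. a length-biased subsample.

    Assumptions (illustrative, not taken from the article):
      - sojourn and clinical durations are lognormal, with correlation
        rho between their logarithms;
      - selection probability is proportional to the sojourn time, a
        stand-in for detection under periodic screening;
      - screening confers no survival benefit (durations are unchanged).
    Returns a dict of (mean, median) clinical duration per group.
    """
    rng = random.Random(seed)
    sojourn, clinical = [], []
    for _ in range(n_cases):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Build two correlated standard normals, then exponentiate.
        sojourn.append(math.exp(z1))
        clinical.append(math.exp(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))

    # Length-biased sampling: draw cases with weight equal to sojourn time.
    # The sojourn time itself is then discarded, mirroring the fact that
    # it is unobserved; only the clinical duration is summarized.
    biased = rng.choices(clinical, weights=sojourn, k=n_cases)

    return {
        "all": (statistics.fmean(clinical), statistics.median(clinical)),
        "length_biased": (statistics.fmean(biased), statistics.median(biased)),
    }


if __name__ == "__main__":
    for rho in (0.0, 0.7):
        result = simulate(rho=rho)
        print(f"rho={rho}: all cases (mean, median) = {result['all']}, "
              f"length-biased (mean, median) = {result['length_biased']}")
```

Under these assumptions, setting rho to 0 should leave the two groups with similar summaries, whereas a positive rho should shift the mean and median of the length-biased group upward even though no duration has been altered.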
Funding Information
  • Army Research Office (W911NF-05-1-0490)
  • National Science Foundation (DMS-08-02295)