Addressing an Idiosyncrasy in Estimating Survival Curves Using Double Sampling in the Presence of Self‐Selected Right Censoring
- 1 June 2001
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 57 (2), 333-342
- https://doi.org/10.1111/j.0006-341x.2001.00333.x
Abstract
Summary. We investigate the use of follow‐up samples of individuals to estimate survival curves from studies that are subject to right censoring from two sources: (i) early termination of the study, namely, administrative censoring, or (ii) censoring due to lost data prior to administrative censoring, so‐called dropout. We assume that, for the full cohort of individuals, administrative censoring times are independent of the subjects' inherent characteristics, including survival time. To address the loss to censoring due to dropout, which we allow to be possibly selective, we consider an intensive second phase of the study where a representative sample of the originally lost subjects is subsequently followed and their data recorded. As with double‐sampling designs in survey methodology, the objective is to provide data on a representative subset of the dropouts. Despite assumed full response from the follow‐up sample, we show that, in general in our setting, administrative censoring times are not independent of survival times within the two subgroups, nondropouts and sampled dropouts. As a result, the stratified Kaplan–Meier estimator is not appropriate for the cohort survival curve. Moreover, using the concept of potential outcomes, as opposed to observed outcomes, and thereby explicitly formulating the problem as a missing data problem, reveals and addresses these complications. We present an estimation method based on the likelihood of an easily observed subset of the data and study its properties analytically for large samples. We evaluate our method in a realistic situation by simulating data that match published margins on survival and dropout from an actual hip‐replacement study. Limitations and extensions of our design and analytic method are discussed.
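For readers unfamiliar with the estimator named in the summary, the following is a minimal sketch of the standard Kaplan–Meier (product-limit) estimator for right-censored data. It is not the likelihood-based method the authors propose, and the summary states that applying Kaplan–Meier separately within nondropouts and sampled dropouts is not appropriate in this setting. The function name `kaplan_meier` and the toy data are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival curve.

    times  : observed follow-up time per subject (event or censoring time)
    events : 1 if the event was observed at that time, 0 if right-censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # all subjects leaving the risk set at t
    at_risk = len(times)        # everyone is at risk before the first time
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # multiply by the conditional probability of surviving past t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]     # drop events and censored subjects alike
    return curve


# Toy data (hypothetical, for illustration only): five subjects,
# two of them right-censored (event flag 0).
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

The sketch treats every censored time as uninformative about survival. That is exactly the assumption the summary says can fail within the two subgroups, which is why the authors turn to a different estimation method.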