Construction and validation of a prognostic model across several studies, with an application in superficial bladder cancer
- 3 March 2004
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 23 (6) , 907-926
- https://doi.org/10.1002/sim.1691
Abstract
Many models for clinical prediction (prognosis or diagnosis) are published in the medical literature every year but few such models find their way into clinical practice. The reason may be that since in most cases models have not been validated in independent data, they lack generality and/or credibility. In this paper we consider the situation in which several compatible, independent data sets relating to a given disease with a time‐to‐event endpoint are available for analysis. The aim is to construct and evaluate a single prognostic model. Building a multivariable model from the available prognostic factors is accomplished within the Cox proportional hazards framework, stratifying by study. Non‐linear relationships with continuous predictors are modelled by using fractional polynomials. To assess the discrimination or separation of a survival model, we use the D statistic of Royston and Sauerbrei. D may be interpreted as the separation (log hazard ratio) between the survival distributions for two independent prognostic groups. To evaluate the generality of a prognostic model across the data sets, we propose ‘internal–external cross‐validation’ on D: each study is omitted in turn, the model parameters are estimated from the remaining studies and D is evaluated in the omitted study. Because the linear predictor of a survival model tells only part of the story, we also suggest a method for investigating heterogeneity in the baseline distribution function across studies which involves fitting completely specified, flexible parametric survival models (Royston and Parmar). Our final models combine the prognostic index (obtained with stratification by study) with the pooled baseline survival distribution (estimated parametrically). By applying this methodology, we construct two prognostic scores in superficial bladder cancer. The simpler of the two scores is more suited to clinical application. 
We show that a three‐group prognostic classification scheme based on either score produces well‐separated survival curves for each of the data sets, despite identifiable heterogeneity among the baseline distribution functions and to a lesser extent among the prognostic indexes for the individual studies. Copyright © 2004 John Wiley & Sons, Ltd.
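The 'internal–external cross‐validation' loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `internal_external_cv` and the `fit` and `evaluate` callables are hypothetical. In the setting of the paper, `fit` would estimate a study‐stratified Cox model from the retained studies and `evaluate` would compute D in the omitted study. Here both are replaced by toy stand‐ins (a pooled mean and a mean absolute error) so that the example runs with the standard library alone.

```python
from typing import Callable, Dict, Hashable, List, TypeVar

Row = TypeVar("Row")
Model = TypeVar("Model")


def internal_external_cv(
    studies: Dict[Hashable, List[Row]],
    fit: Callable[[Dict[Hashable, List[Row]]], Model],
    evaluate: Callable[[Model, List[Row]], float],
) -> Dict[Hashable, float]:
    """Omit each study in turn, fit on the rest, score on the omitted study."""
    if len(studies) < 2:
        raise ValueError("internal-external cross-validation needs at least two studies")
    scores: Dict[Hashable, float] = {}
    for held_out, test_rows in studies.items():
        # Training data: every study except the one currently held out.
        training = {name: rows for name, rows in studies.items() if name != held_out}
        model = fit(training)
        # The held-out study plays the role of independent validation data.
        scores[held_out] = evaluate(model, test_rows)
    return scores


# Toy stand-ins (assumptions for illustration only, not survival models).
def toy_fit(training: Dict[Hashable, List[float]]) -> float:
    values = [x for rows in training.values() for x in rows]
    return sum(values) / len(values)  # pooled mean of the retained studies


def toy_evaluate(model: float, rows: List[float]) -> float:
    return sum(abs(x - model) for x in rows) / len(rows)  # mean absolute error


toy_studies = {"A": [1.0, 2.0], "B": [3.0], "C": [5.0, 7.0]}
toy_scores = internal_external_cv(toy_studies, toy_fit, toy_evaluate)
```

Comparing the per‐study scores gives a view of how well a model built without a study carries over to it. Wide variation between the omitted studies would suggest limited generality.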
This publication has 28 references indexed in Scilit:
- A new measure of prognostic separation in survival data. Statistics in Medicine, 2004
- Validation, calibration, revision and combination of prognostic survival models. Statistics in Medicine, 2000
- What do we mean by validating a prognostic model? Statistics in Medicine, 2000
- The Use of Resampling Methods to Simplify Regression Models in Medical Statistics. Journal of the Royal Statistical Society Series C: Applied Statistics, 1999
- Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 1999
- Validation of existing and development of new prognostic classification schemes in node negative breast cancer. Breast Cancer Research and Treatment, 1997
- Commentary: Prognostic models: clinically useful or quickly forgotten? BMJ, 1995
- The Nottingham Prognostic Index applied to 9,149 patients from the studies of the Danish Breast Cancer Cooperative Group (DBCG). Breast Cancer Research and Treatment, 1994
- Confirmation of a long-term prognostic index in breast cancer. The Breast, 1993
- The Nottingham prognostic index in primary breast cancer. Breast Cancer Research and Treatment, 1992