Validation and updating of predictive logistic regression models: a study on sample size and shrinkage
- 27 July 2004
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 23 (16), 2567-2586
- https://doi.org/10.1002/sim.1844
Abstract
A logistic regression model may be used to provide predictions of outcome for individual patients at a centre other than the one where the model was developed. When empirical data are available from this centre, the validity of predictions can be assessed by comparing observed outcomes and predicted probabilities. Subsequently, the model may be updated to improve predictions for future patients.

As an example, we analysed 30‐day mortality after acute myocardial infarction in a large data set (GUSTO‐I, n = 40 830). We validated and updated a previously published model from another study (TIMI‐II, n = 3339) in validation samples ranging from small (200 patients, 14 deaths) to large (10 000 patients, 700 deaths). Updated models were tested on independent patients. Updating methods included re‐calibration (re‐estimation of the intercept or slope of the linear predictor) and more structural model revisions (re‐estimation of some or all regression coefficients, model extension with more predictors). We applied heuristic shrinkage approaches in the model revision methods, such that regression coefficients were shrunken towards their re‐calibrated values. Parsimonious updating methods were found preferable to more extensive model revisions, which should only be attempted with relatively large validation samples in combination with shrinkage.

Copyright © 2004 John Wiley & Sons, Ltd.
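The two re‐calibration methods named in the abstract (re‐estimating the intercept alone, or the intercept and slope of the linear predictor) can be illustrated with a minimal sketch. This is not the authors' code: the function names (`fit_logistic`, `recalibrate_intercept`, `recalibrate_intercept_slope`) are hypothetical, the data are simulated rather than taken from GUSTO‐I or TIMI‐II, and the fitting routine is a plain Newton-Raphson maximum-likelihood solver chosen only to keep the example self-contained.

```python
import numpy as np


def sigmoid(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, offset=None, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression by Newton-Raphson.

    `offset` is added to the linear predictor with a fixed coefficient of 1.
    """
    n, k = X.shape
    offset = np.zeros(n) if offset is None else offset
    beta = np.zeros(k)
    for _ in range(n_iter):
        p = sigmoid(X @ beta + offset)
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])   # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def recalibrate_intercept(lp, y):
    """Re-estimate only the intercept; the slope of `lp` stays fixed at 1."""
    X = np.ones((len(lp), 1))
    return fit_logistic(X, y, offset=lp)[0]


def recalibrate_intercept_slope(lp, y):
    """Re-estimate intercept a and slope b in logit(p) = a + b * lp."""
    X = np.column_stack([np.ones(len(lp)), lp])
    a, b = fit_logistic(X, y)
    return a, b


# Simulated validation sample (assumption: NOT the GUSTO-I data).
# `lp` plays the role of the previously published model's linear predictor;
# the new centre's outcomes follow logit(p) = 0.5 + 0.8 * lp, so the old
# model is miscalibrated in both intercept and slope.
rng = np.random.default_rng(0)
lp = rng.normal(-2.5, 1.0, size=5000)
y = rng.binomial(1, sigmoid(0.5 + 0.8 * lp))

a_only = recalibrate_intercept(lp, y)
a, b = recalibrate_intercept_slope(lp, y)

print("observed event rate:      ", y.mean())
print("mean prediction, original:", sigmoid(lp).mean())
print("mean prediction, a only:  ", sigmoid(a_only + lp).mean())
print("intercept and slope:      ", a, b)
```

After either update, the mean predicted probability in the validation sample matches the observed event rate, which is the sense in which the predictions are calibrated "in the large". Both methods estimate only one or two parameters, which fits the abstract's conclusion that parsimonious updating is preferable when the validation sample is small.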