Practical experiences on the necessity of external validation

1 October 2007

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 26 (30) , 5499-5511
https://doi.org/10.1002/sim.3069

Abstract

The validity of prognostic models is an important prerequisite for their applicability in practical clinical settings. Here, we report on a specific prognostic study on stroke patients and describe how we explored the prediction performance of our model. We considered two practically highly relevant generalization aspects, namely, the model's performance in patients recruited at a later time point (temporal transportability) and in medical centers different from those used for model building (geographic transportability). To estimate the accuracy of the model, we investigated classical internal validation techniques and leave‐one‐center‐out cross validation (CV). Prognostic models predicting functional independence of stroke patients were developed in a training set using logistic regression, support vector machines, and random forests (RFs). Tenfold CV and leave‐one‐center‐out CV were employed to estimate temporal and geographic transportability of the models. For temporal and external validation, the resulting models were used to classify patients from a later time point and from different clinics. When applying the regression model or the RFs, accuracy in the temporal validation data was well predicted from classical internal validation. However, when predicting geographic transportability all approaches had difficulties. We observed that the leave‐one‐center‐out CV yielded better estimates than classical CV. On the basis of our results, we conclude that external validation in patients from different clinics is required before a prognostic model can be applied in practice. Even validating the model in patients recruited merely at a later time point does not suffice to predict how it may fare with regard to another clinic. Copyright © 2007 John Wiley & Sons, Ltd.
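The difference between the two resampling schemes named in the abstract can be made concrete with a minimal sketch. It is an illustration only, not the authors' implementation: the function names, the toy center labels, and the fold counts are all hypothetical. Classical k-fold CV mixes patients from every center into each fold, whereas leave-one-center-out CV holds out all patients of one center at a time, so each test fold mimics an unseen clinic.

```python
import random
from collections import defaultdict


def kfold_splits(n_samples, k=10, seed=0):
    """Classical k-fold CV: shuffle indices, then deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        yield sorted(train), sorted(held_out)


def leave_one_center_out_splits(centers):
    """Leave-one-center-out CV: each fold holds out every patient of one center."""
    by_center = defaultdict(list)
    for idx, center in enumerate(centers):
        by_center[center].append(idx)
    for center in sorted(by_center):
        test = by_center[center]
        held = set(test)
        train = [i for i in range(len(centers)) if i not in held]
        yield center, train, test


# Hypothetical toy data: 12 patients from three made-up centers.
centers = ["A"] * 5 + ["B"] * 4 + ["C"] * 3
loco = list(leave_one_center_out_splits(centers))
kfold = list(kfold_splits(len(centers), k=3))
```

In this sketch, a model would be fit on each `train` index list and scored on the matching test list. With leave-one-center-out splits, no patient from the held-out center ever appears in training, which is the property the abstract relies on when it uses this scheme to estimate geographic transportability.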
