Why do so many prognostic factors fail to pan out?
- 1 October 1992
- journal article
- Published by Springer Nature in Breast Cancer Research and Treatment
- Vol. 22 (3), 197-206
- https://doi.org/10.1007/bf01840833
Abstract
Although there can be many reasons that one study fails to confirm the results of another, the consequences of data exploration and the potential for spuriously significant results are often overlooked. A series of simulation experiments was designed to mimic the characteristics of relapse-free survival data that might be encountered in a prognostic factor study of node-negative breast cancer patients. Each simulated dataset of 500 or 250 cases was divided into a training set, used to select the "best" prognostic factor cutpoint, and a validation set, used to confirm the cutpoint. Testing multiple cutpoints on the training set markedly increased the risk of making a Type I error. The power to detect even small true differences on the training set was substantial, and increased as the number of cutpoints increased. Regardless of the number of cutpoints tested on the training sets, the Type I error rate on an independent validation data set was quite stable, and the power of the validation set to detect true differences was not related to the number of cutpoints. Validation power closely approximated that predicted for a simple two-group comparison. It is therefore recommended that exploratory analyses of prognostic factors formally employ some method of adjusting for increased Type I errors, such as independent validation sets, ad hoc adjustment factors, or other statistical methods of estimating the true risk.
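The train/validate design described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' procedure: it replaces relapse-free survival data with a normally distributed outcome, replaces the survival comparison with a large-sample z-test on group means, and uses equally spaced candidate cutpoints on a uniform marker. The names `simulate`, `split_p`, and `two_sided_p`, and all parameter values, are hypothetical.

```python
import math
import random


def two_sided_p(low, high):
    """Two-sided p-value from a large-sample z-test on the difference in means."""
    def mean_var(values):
        m = sum(values) / len(values)
        v = sum((y - m) ** 2 for y in values) / (len(values) - 1)
        return m, v

    m_low, v_low = mean_var(low)
    m_high, v_high = mean_var(high)
    se = math.sqrt(v_low / len(low) + v_high / len(high))
    z = (m_high - m_low) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def split_p(data, cut):
    """Dichotomize the marker at `cut` and compare outcomes in the two groups."""
    low = [y for x, y in data if x <= cut]
    high = [y for x, y in data if x > cut]
    return two_sided_p(low, high)


def simulate(n_sims=300, n_cases=500, n_cuts=10, effect=0.0, alpha=0.05, seed=1):
    """Return (training, validation) rejection rates for the selected cutpoint.

    Assumption: with effect=0.0 the marker is unrelated to the outcome, so both
    rates estimate the Type I error; a nonzero effect shifts the outcome for
    marker values above 0.5, so both rates estimate power.
    """
    rng = random.Random(seed)
    cuts = [(i + 1) / (n_cuts + 1) for i in range(n_cuts)]
    train_hits = 0
    valid_hits = 0
    for _ in range(n_sims):
        data = []
        for _ in range(n_cases):
            x = rng.random()                       # marker value, uniform on [0, 1)
            mu = effect if x > 0.5 else 0.0        # true group difference, if any
            data.append((x, rng.gauss(mu, 1.0)))
        half = n_cases // 2
        train, valid = data[:half], data[half:]
        # Exploratory step: keep the cutpoint with the smallest training p-value.
        best_cut = min(cuts, key=lambda c: split_p(train, c))
        train_hits += split_p(train, best_cut) < alpha
        # Confirmatory step: test only that one cutpoint on the held-out half.
        valid_hits += split_p(valid, best_cut) < alpha
    return train_hits / n_sims, valid_hits / n_sims


if __name__ == "__main__":
    for k in (1, 5, 10):
        train_rate, valid_rate = simulate(n_cuts=k)
        print(f"cutpoints={k:2d}  training={train_rate:.3f}  validation={valid_rate:.3f}")
```

Under the null setting (`effect=0.0`), the training rejection rate in this sketch should climb well above the nominal 5% as `n_cuts` grows, while the validation rate should stay near it, since only one pre-selected cutpoint is tested on the held-out half. Passing a nonzero `effect` gives a rough view of training versus validation power under the same assumptions.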