How Biased is the Apparent Error Rate of a Prediction Rule?
- 1 June 1986
- journal article
- research article
- Published by JSTOR in Journal of the American Statistical Association
- Vol. 81 (394), 461
- https://doi.org/10.2307/2289236
Abstract
A regression model is fitted to an observed set of data. How accurate is the model for predicting future observations? The apparent error rate tends to underestimate the true error rate because the data have been used twice, both to fit the model and to check its accuracy. We provide simple estimates for the downward bias of the apparent error rate. The theory applies to general exponential family linear models and general measures of prediction error. Special attention is given to the case of logistic regression on binary data, with error rates measured by the proportion of misclassified cases. Several connected ideas are compared: Mallows's Cp, cross-validation, generalized cross-validation, the bootstrap, and Akaike's information criterion.
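The downward bias described in the abstract can be illustrated with a small sketch. It is not the paper's estimator. It assumes the simplest case the abstract covers: an ordinary least-squares line with squared-error loss. It compares the apparent error (the model is scored on the same data used to fit it) with a leave-one-out cross-validation estimate. The simulated data, the seed, and the function names `fit_line`, `apparent_error`, and `loo_cv_error` are all hypothetical choices made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def apparent_error(xs, ys):
    """Mean squared error on the same data that was used to fit the line."""
    a, b = fit_line(xs, ys)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loo_cv_error(xs, ys):
    """Leave-one-out estimate: each point is predicted by a fit that excludes it."""
    total = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total / len(xs)


# Simulated data (an assumption for illustration): y = 1 + 2x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(20)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

apparent = apparent_error(xs, ys)
loo = loo_cv_error(xs, ys)
optimism = loo - apparent  # rough estimate of the downward bias

print(f"apparent error:      {apparent:.4f}")
print(f"leave-one-out error: {loo:.4f}")
print(f"estimated optimism:  {optimism:.4f}")
```

For least squares, each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, so the apparent error here never exceeds the cross-validation estimate. For logistic regression with misclassification loss, the case the abstract emphasizes, that ordering holds only on average, not in every sample.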