Cross-Validation, the Jackknife, and the Bootstrap: Excess Error Estimation in Forward Logistic Regression

1 March 1986

journal article
research article
Published by JSTOR in Journal of the American Statistical Association

Vol. 81 (393), 108
https://doi.org/10.2307/2287975

Abstract

Given a prediction rule based on a set of patients, what is the probability of incorrectly predicting the outcome of a new patient? Call this probability the true error. An optimistic estimate is the apparent error, or the proportion of incorrect predictions on the original set of patients, and it is the goal of this article to study estimates of the excess error, or the difference between the true and apparent errors. I consider three estimates of the excess error: cross-validation, the jackknife, and the bootstrap. Using simulations and real data, the three estimates for a specific prediction rule are compared. When the prediction rule is allowed to be complicated, overfitting becomes a real danger, and excess error estimation becomes important. The prediction rule chosen here is moderately complicated, involving a variable-selection procedure based on forward logistic regression.
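The quantities defined in the abstract can be illustrated with a minimal sketch. It is not the article's procedure: the forward logistic regression rule is replaced by a hypothetical nearest-class-mean rule on a single simulated covariate, and the function names (fit_rule, cv_excess_error, bootstrap_excess_error, simulate) are assumptions made for illustration. The cross-validation estimate is taken here as the leave-one-out error minus the apparent error, and the bootstrap estimate as the average, over resamples, of the refitted rule's error on the original data minus its error on the resample. The jackknife estimate is omitted.

```python
import random

def fit_rule(data):
    """Hypothetical stand-in rule: assign x to the class with the nearest mean."""
    overall = sum(x for x, _ in data) / len(data)
    means = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        # If a class is missing from the sample, fall back to the overall mean.
        means[label] = sum(xs) / len(xs) if xs else overall
    return lambda x: 0 if abs(x - means[0]) <= abs(x - means[1]) else 1

def error_rate(rule, data):
    """Proportion of incorrect predictions made by rule on data."""
    return sum(rule(x) != y for x, y in data) / len(data)

def apparent_error(data):
    """Error of the rule on the same patients used to build it."""
    return error_rate(fit_rule(data), data)

def cv_excess_error(data):
    """Leave-one-out cross-validation error minus the apparent error."""
    misses = 0
    for i, (x, y) in enumerate(data):
        rule = fit_rule(data[:i] + data[i + 1:])  # refit without patient i
        misses += rule(x) != y
    return misses / len(data) - apparent_error(data)

def bootstrap_excess_error(data, n_boot=200, seed=0):
    """Average of (error on original data - error on resample) over resamples."""
    rng = random.Random(seed)
    n = len(data)
    total = 0.0
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        rule = fit_rule(sample)  # the rule is rebuilt on each resample
        total += error_rate(rule, data) - error_rate(rule, sample)
    return total / n_boot

def simulate(n=40, seed=1):
    """Simulated patients: outcome y in {0, 1}, covariate x ~ Normal(y, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        data.append((rng.gauss(float(y), 1.0), y))
    return data

if __name__ == "__main__":
    patients = simulate()
    print("apparent error:        ", apparent_error(patients))
    print("CV excess error:       ", cv_excess_error(patients))
    print("bootstrap excess error:", bootstrap_excess_error(patients))
```

Adding an excess error estimate to the apparent error gives a corrected estimate of the true error. With a rule this simple the correction is likely to be small. The abstract's point is that it matters more as the rule becomes more complicated, for example when variable selection is part of the fitting step, in which case the selection would have to be repeated inside fit_rule for every refit.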
