A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-Test Split

1 July 1997

journal article
Published by MIT Press in Neural Computation

Vol. 9 (5), 1143-1161
https://doi.org/10.1162/neco.1997.9.5.1143

Abstract

We give a theoretical and experimental analysis of the generalization error of cross validation using two natural measures of the problem under consideration. The approximation rate measures the accuracy to which the target function can be ideally approximated as a function of the number of parameters, and thus captures the complexity of the target function with respect to the hypothesis model. The estimation rate measures the deviation between the training and generalization errors as a function of the number of parameters, and thus captures the extent to which the hypothesis model suffers from overfitting. Using these two measures, we give a rigorous and general bound on the error of the simplest form of cross validation. The bound clearly shows the dangers of making γ (the fraction of data saved for testing) too large or too small. By optimizing the bound with respect to γ, we then argue that the following qualitative properties of cross-validation behavior should be quite robust to significant changes in the underlying model selection problem: When the target function complexity is small compared to the sample size, the performance of cross validation is relatively insensitive to the choice of γ. The importance of choosing γ optimally increases, and the optimal value for γ decreases, as the target function becomes more complex relative to the sample size. There is nevertheless a single fixed value for γ that works nearly optimally for a wide range of target function complexity.
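The "simplest form of cross validation" described in the abstract can be illustrated with a minimal sketch: hold out a fraction γ of the sample, fit one hypothesis per complexity level on the remaining (1 − γ) fraction, and keep the hypothesis with the lowest held-out error. Everything concrete below is an assumption made for illustration and is not taken from the paper: the function name `holdout_select`, the use of polynomial degree as the number of parameters, squared error as the loss, and the synthetic sine target. It does not compute the paper's bound or reproduce its experiments.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def holdout_select(x, y, gamma, max_degree, rng):
    """Choose a polynomial degree by simple hold-out cross validation.

    gamma is the fraction of the sample reserved for testing (hypothetical
    helper; polynomial degree stands in for "number of parameters").
    """
    m = len(x)
    n_test = int(round(gamma * m))
    if not 0 < n_test < m:
        raise ValueError("gamma must leave both training and test points")

    # Random split: the first n_test shuffled indices form the test set.
    perm = rng.permutation(m)
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    best_degree, best_err = None, np.inf
    for degree in range(max_degree + 1):
        # Fit using only the (1 - gamma) * m training points.
        coeffs = P.polyfit(x[train_idx], y[train_idx], degree)
        # Score on the gamma * m held-out points.
        pred = P.polyval(x[test_idx], coeffs)
        err = float(np.mean((pred - y[test_idx]) ** 2))
        if err < best_err:
            best_degree, best_err = degree, err
    return best_degree, best_err


# Assumed synthetic problem: noisy samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=200)

results = {g: holdout_select(x, y, g, 8, rng) for g in (0.1, 0.3, 0.5, 0.9)}
for g, (degree, err) in results.items():
    print(f"gamma={g:.1f}  chosen degree={degree}  held-out MSE={err:.4f}")
```

Repeating the run for several values of γ and for targets of different complexity is one informal way to explore the trade-off the abstract describes: a small γ leaves few points to compare hypotheses reliably, while a large γ leaves few points to train them.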
