Statistical mechanics approach to early stopping and weight decay
- 1 July 1998
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review E
- Vol. 58 (1) , 833-844
- https://doi.org/10.1103/physreve.58.833
Abstract
Overtraining as a result of the difference between the empirical loss and the expected loss is a serious problem in neural network learning. It is known that methods such as early stopping, weight decay, or input noise can reduce overtraining. Here, these methods are studied in detail. We use a model that allows an analytical treatment. The treatment is based on an equilibrium statistical mechanics approach that is extended to its finite temperature solution. An unrealizable task that shows strong overtraining is examined. We find that overtraining can be completely avoided with each of the three methods if the parameters are optimally chosen. It is also shown that overtraining can appear in a realizable task if the task is highly nonlinear. In that case, too, overtraining can be avoided with each of the three methods.
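As a rough numerical illustration of two of the methods named in the abstract, the sketch below trains a linear student by gradient descent on a small noisy data set, once without and once with weight decay, and records the step at which a held-out error is lowest (early stopping). It is not the paper's analytical model: the linear teacher, the noise level, the data sizes, the learning rate, and all function names (`make_data`, `train`, and so on) are assumptions chosen only for illustration.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def make_data(n, teacher, noise, rng):
    """Draw n Gaussian inputs and label them with a noisy linear teacher (assumed setup)."""
    xs = [[rng.gauss(0.0, 1.0) for _ in teacher] for _ in range(n)]
    ys = [dot(teacher, x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def mse(w, xs, ys):
    """Mean squared error of the linear student w on (xs, ys)."""
    return sum((dot(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(train_set, val_set, steps=500, lr=0.05, weight_decay=0.0):
    """Batch gradient descent on half the mean squared error plus an L2 penalty.

    Tracks the step with the lowest held-out error, which is where an
    early-stopping rule with perfect hindsight would have halted.
    """
    xs, ys = train_set
    dim = len(xs[0])
    w = [0.0] * dim
    best = {"step": 0, "val_error": mse(w, *val_set)}
    for step in range(1, steps + 1):
        grad = [0.0] * dim
        for x, y in zip(xs, ys):
            err = dot(w, x) - y
            for j in range(dim):
                grad[j] += err * x[j] / len(xs)
        # Weight decay adds weight_decay * w to the gradient of the data term.
        w = [wj - lr * (gj + weight_decay * wj) for wj, gj in zip(w, grad)]
        val_error = mse(w, *val_set)
        if val_error < best["val_error"]:
            best = {"step": step, "val_error": val_error}
    return {"weights": w, "final_val_error": mse(w, *val_set), "best": best}

rng = random.Random(0)  # fixed seed so the run is repeatable
teacher = [rng.gauss(0.0, 1.0) for _ in range(10)]
train_set = make_data(20, teacher, 1.0, rng)   # few noisy examples
val_set = make_data(200, teacher, 1.0, rng)    # stand-in for the expected loss

plain = train(train_set, val_set)
decayed = train(train_set, val_set, weight_decay=0.5)
for name, run in (("no decay", plain), ("weight decay", decayed)):
    print(name, run["best"]["step"], run["best"]["val_error"], run["final_val_error"])
```

Each printed line gives the best step, the held-out error at that step, and the held-out error at the end of training. A gap between the last two numbers is the kind of overtraining the abstract describes; how large it is depends on the assumed noise, sample size, and decay strength.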