Abstract
Overtraining, which arises from the gap between the empirical loss and the expected loss, is a serious problem in neural network learning. It is known that methods such as early stopping, weight decay, and input noise can reduce overtraining. Here, these methods are studied in detail using a model that admits an analytical treatment, based on an equilibrium statistical mechanics approach extended to its finite-temperature solution. An unrealizable task that shows strong overtraining is examined. We find that overtraining can be avoided completely with each of the three methods if their parameters are chosen optimally. It is also shown that overtraining can appear in a realizable task if the task is highly nonlinear; there, too, overtraining can be avoided with each of the three methods.
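
The paper's analysis is statistical-mechanical rather than algorithmic, but the three regularization methods it compares are easy to illustrate concretely. Below is a minimal sketch (not from the paper) of how each method enters an ordinary gradient-descent training loop on a toy teacher-student regression task; the task, the split into training and validation sets, and hyperparameters such as `weight_decay`, `noise_std`, and `patience` are illustrative assumptions, not quantities defined in the paper.

```python
# Minimal sketch of the three regularizers named in the abstract:
# early stopping, weight decay, and input noise, on a toy regression task.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy linear teacher, split into training and validation sets.
n_train, n_val, d = 40, 40, 20
w_teacher = rng.normal(size=d)
X_train = rng.normal(size=(n_train, d))
y_train = X_train @ w_teacher + 0.5 * rng.normal(size=n_train)
X_val = rng.normal(size=(n_val, d))
y_val = X_val @ w_teacher + 0.5 * rng.normal(size=n_val)

def train(weight_decay=0.0, noise_std=0.0, patience=None,
          lr=0.01, max_epochs=2000):
    """Gradient descent on squared error with optional regularizers."""
    w = np.zeros(d)
    best_val, best_w, bad_epochs = np.inf, w.copy(), 0
    for epoch in range(max_epochs):
        # Input noise: perturb the training inputs afresh each epoch.
        X = X_train + noise_std * rng.normal(size=X_train.shape)
        grad = X.T @ (X @ w - y_train) / n_train
        # Weight decay: an L2 penalty pulling the weights toward zero.
        w -= lr * (grad + weight_decay * w)
        val_loss = np.mean((X_val @ w - y_val) ** 2)
        if patience is not None:
            # Early stopping: keep the best weights seen on the validation
            # set and stop after `patience` epochs without improvement.
            if val_loss < best_val:
                best_val, best_w, bad_epochs = val_loss, w.copy(), 0
            else:
                bad_epochs += 1
                if bad_epochs > patience:
                    w = best_w
                    break
    return w, np.mean((X_val @ w - y_val) ** 2)

for label, kwargs in [("plain", {}),
                      ("weight decay", {"weight_decay": 0.1}),
                      ("input noise", {"noise_std": 0.5}),
                      ("early stopping", {"patience": 20})]:
    _, val_loss = train(**kwargs)
    print(f"{label:15s} validation loss: {val_loss:.3f}")
```

In this sketch the validation loss stands in for the expected (generalization) loss; the paper instead computes the expected loss analytically and studies how the optimal choice of stopping time, decay strength, or noise level removes the overtraining gap.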
