The Effects of Adding Noise During Backpropagation Training on a Generalization Performance
- 1 April 1996
- journal article
- Published by MIT Press in Neural Computation
- Vol. 8 (3), 643-674
- https://doi.org/10.1162/neco.1996.8.3.643
Abstract
We study the effects of adding noise to the inputs, outputs, weight connections, and weight changes of multilayer feedforward neural networks during backpropagation training. We rigorously derive and analyze the objective functions that are minimized by the noise-affected training processes. We show that input noise and weight noise encourage the neural-network output to be a smooth function of the input or its weights, respectively. In the weak-noise limit, noise added to the output of the neural networks only changes the objective function by a constant. Hence, it cannot improve generalization. Input noise introduces penalty terms in the objective function that are related to, but distinct from, those found in the regularization approaches. Simulations have been performed on a regression and a classification problem to further substantiate our analysis. Input noise is found to be effective in improving the generalization performance for both problems. However, weight noise is found to be effective in improving the generalization performance only for the classification problem. Other forms of noise have practically no effect on generalization.
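The input-noise scheme discussed in the abstract can be illustrated with a short, self-contained sketch: at each training pass the clean inputs are replaced by noise-corrupted copies, and ordinary backpropagation proceeds on the corrupted inputs, while evaluation uses the clean data. This is only an illustrative reconstruction of the general technique; the network size, learning rate, and noise level below are assumptions for the demo, not values or code from the paper.

```python
# Illustrative sketch (not from the paper): train a one-hidden-layer MLP on a
# toy regression task, adding zero-mean Gaussian noise to the inputs at each
# epoch. Setting sigma = 0 recovers plain backpropagation.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) plus observation noise.
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X) + 0.1 * rng.normal(size=X.shape)

# Small MLP: 1 -> 10 -> 1 with tanh hidden units (sizes are assumptions).
W1 = rng.normal(scale=0.5, size=(1, 10)); b1 = np.zeros(10)
W2 = rng.normal(scale=0.5, size=(10, 1)); b2 = np.zeros(1)

lr = 0.05      # learning rate (assumed)
sigma = 0.1    # standard deviation of the injected input noise (assumed)

for epoch in range(2000):
    # Fresh noise is drawn for every presentation of the training set.
    X_noisy = X + sigma * rng.normal(size=X.shape)

    # Forward pass on the corrupted inputs.
    h = np.tanh(X_noisy @ W1 + b1)      # hidden activations, shape (200, 10)
    y_hat = h @ W2 + b2                 # network output, shape (200, 1)

    # Mean-squared-error loss gradient w.r.t. the output.
    grad_out = 2 * (y_hat - y) / len(X)

    # Backward pass (standard backpropagation).
    dW2 = h.T @ grad_out
    db2 = grad_out.sum(axis=0)
    dh = grad_out @ W2.T
    dz = dh * (1 - h ** 2)              # derivative of tanh
    dW1 = X_noisy.T @ dz
    db1 = dz.sum(axis=0)

    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Evaluate on the clean inputs: the noise is a training-time device only.
h = np.tanh(X @ W1 + b1)
print("final training MSE:", float(np.mean((h @ W2 + b2 - y) ** 2)))
```

Comparing runs with `sigma = 0` and `sigma > 0` on held-out data is the kind of experiment the abstract refers to when it says input noise can improve generalization; the penalty-term analysis in the paper explains why the noisy objective favors smoother fits.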