Training with Noise is Equivalent to Tikhonov Regularization
- 1 January 1995
- journal article
- Published by MIT Press in Neural Computation
- Vol. 7 (1), 108–116
- https://doi.org/10.1162/neco.1995.7.1.108
Abstract
It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learning algorithm based on error minimization. In this paper we show that for the purposes of network training, the regularization term can be reduced to a positive semi-definite form that involves only first derivatives of the network mapping. For a sum-of-squares error function, the regularization term belongs to the class of generalized Tikhonov regularizers. Direct minimization of the regularized error function provides a practical alternative to training with noise.
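The equivalence the abstract describes can be checked numerically in the simplest setting: for a *linear* map the expected sum-of-squares error under Gaussian input noise equals the clean error plus a Tikhonov penalty on the first derivatives of the mapping (here simply the squared Frobenius norm of the weight matrix, scaled by the noise variance). The sketch below is illustrative only and assumes this linear special case; all variable names are hypothetical and the Monte Carlo estimate is approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "network" y = W x + b, trained with a sum-of-squares error.
W = rng.normal(size=(2, 3))
b = rng.normal(size=2)
X = rng.normal(size=(100, 3))   # inputs
T = rng.normal(size=(100, 2))   # targets

def sse(W, b, X, T):
    """Sum-of-squares error E = (1/2) sum_n ||y(x_n) - t_n||^2."""
    return 0.5 * np.sum((X @ W.T + b - T) ** 2)

sigma = 0.1  # standard deviation of the additive input noise

# Monte Carlo estimate of the expected error when training inputs are
# corrupted by zero-mean Gaussian noise of variance sigma^2.
noisy_error = np.mean([
    sse(W, b, X + sigma * rng.normal(size=X.shape), T)
    for _ in range(10000)
])

# Tikhonov form: clean error plus (sigma^2 / 2) * N * ||dY/dX||_F^2.
# For a linear map the Jacobian dY/dX is just W, so the penalty is exact.
regularized_error = sse(W, b, X, T) + 0.5 * sigma**2 * X.shape[0] * np.sum(W**2)
```

Running this, `noisy_error` and `regularized_error` agree to within Monte Carlo error, illustrating that minimizing the penalized error is a practical stand-in for averaging over noisy presentations. For a nonlinear network the correspondence holds only to leading order in the noise variance, which is the regime the paper analyses.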