Using additive noise in back-propagation training
- 1 January 1992
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 3 (1), 24-38
- https://doi.org/10.1109/72.105415
Abstract
We discuss the possibility of improving the generalization capability of a neural network by introducing additive noise to the training samples. The network considered is a feedforward layered neural network trained with the back-propagation algorithm. Back-propagation training is viewed as nonlinear least-squares regression and the additive noise is interpreted as generating a kernel estimate of the probability density that describes the training vector distribution. Two specific application types are considered: pattern classifier networks and estimation of a nonstochastic mapping from data that are corrupted by measurement errors. We do not prove that the introduction of additive noise to the training vectors always improves network generalization. However, our analysis suggests mathematically justified rules for choosing the characteristics of noise if additive noise is used in training. Further, using results of mathematical statistics we establish various asymptotic consistency results for the proposed method. We also report numerical simulations that give support to the applicability of the suggested training method.
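The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian noise shape, the value of `sigma`, the network size, and the names `jitter` and `train_with_input_noise` are all assumptions made for illustration, and the abstract does not state the paper's own rules for choosing the noise. Each training step picks a stored sample at random and perturbs its input, which amounts to drawing inputs from a kernel density estimate centred on the training vectors.

```python
import math
import random


def jitter(x, sigma, rng):
    """Return a copy of input vector x with N(0, sigma^2) noise on each component."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def train_with_input_noise(samples, hidden=6, sigma=0.1, lr=0.05, steps=5000, seed=0):
    """Per-sample back-propagation on noise-perturbed copies of the inputs.

    samples: list of (x, t) pairs, where x is a list of floats and t a float.
    Picking a stored pair uniformly and jittering x is equivalent to sampling
    inputs from a Gaussian kernel density estimate with bandwidth sigma
    (kernel shape and sigma are illustrative assumptions).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # One tanh hidden layer and a linear output unit (assumed architecture).
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    for _ in range(steps):
        x, t = rng.choice(samples)      # pick a stored training pair
        x = jitter(x, sigma, rng)       # additive noise on the input only
        h, out = forward(x)
        err = out - t                   # gradient of 0.5 * (out - t)^2 w.r.t. out
        for j in range(hidden):
            # Error back-propagated through tanh, using w2[j] before its update.
            dh = err * w2[j] * (1.0 - h[j] ** 2)
            w2[j] -= lr * err * h[j]
            b1[j] -= lr * dh
            for i in range(n_in):
                W1[j][i] -= lr * dh * x[i]
        b2 -= lr * err

    return lambda x: forward(x)[1]
```

Setting `sigma=0.0` recovers ordinary training on the unperturbed samples, so the two regimes can be compared on held-out data under otherwise identical settings.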