Using additive noise in back-propagation training

1 January 1992

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 3 (1) , 24-38
https://doi.org/10.1109/72.105415

Abstract

We discuss the possibility of improving the generalization capability of a neural network by introducing additive noise to the training samples. The network considered is a feedforward layered neural network trained with the back-propagation algorithm. Back-propagation training is viewed as nonlinear least-squares regression and the additive noise is interpreted as generating a kernel estimate of the probability density that describes the training vector distribution. Two specific application types are considered: pattern classifier networks and estimation of a nonstochastic mapping from data that are corrupted by measurement errors. We do not prove that the introduction of additive noise to the training vectors always improves network generalization. However, our analysis suggests mathematically justified rules for choosing the characteristics of noise if additive noise is used in training. Further, using results of mathematical statistics we establish various asymptotic consistency results for the proposed method. We also report numerical simulations that give support to the applicability of the suggested training method.
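The kernel-estimate reading of additive noise can be illustrated with a minimal sketch. It assumes a Gaussian kernel and a single scalar noise level; the abstract does not say which noise distribution or variance rule the authors use. The function name `sample_noisy_batch` and the parameter `noise_std` are hypothetical and chosen only for illustration. Resampling a stored training input uniformly and then adding zero-mean Gaussian noise is equivalent to drawing from a Gaussian kernel density estimate centered on the training inputs, with `noise_std` acting as the kernel bandwidth.

```python
import numpy as np


def sample_noisy_batch(x_train, y_train, batch_size, noise_std, rng):
    """Draw a training batch with additive Gaussian noise on the inputs.

    Assumption (not from the abstract): isotropic Gaussian noise with
    standard deviation `noise_std`, which plays the role of the kernel
    bandwidth in a Gaussian kernel density estimate.
    """
    # Pick stored training pairs uniformly at random, with replacement.
    idx = rng.integers(0, len(x_train), size=batch_size)
    # Perturb only the inputs; the targets are left unchanged.
    noise = rng.normal(0.0, noise_std, size=(batch_size, x_train.shape[1]))
    return x_train[idx] + noise, y_train[idx]


# Usage: a fresh noisy batch would be drawn at every training step
# and passed to whatever back-propagation update is in use.
rng = np.random.default_rng(0)
x_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = np.array([0, 1, 1, 0])
x_batch, y_batch = sample_noisy_batch(x_train, y_train, 8, 0.1, rng)
```

Setting `noise_std` to zero recovers ordinary resampling of the training set, so the noise level is the only extra quantity this sketch introduces.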
