Linear-Least-Squares Initialization of Multilayer Perceptrons Through Backpropagation of the Desired Response
- 7 March 2005
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 16 (2), 325-337
- https://doi.org/10.1109/tnn.2004.841777
Abstract
Training multilayer neural networks is typically carried out using descent techniques such as the gradient-based backpropagation (BP) of error or quasi-Newton approaches, including the Levenberg-Marquardt algorithm. This is mainly because there are no analytical methods to find the optimal weights, so iterative local or global optimization techniques are necessary. The success of iterative optimization procedures is strictly dependent on the initial conditions; therefore, in this paper, we devise a novel, principled method of backpropagating the desired response through the layers of a multilayer perceptron (MLP), which enables us to accurately initialize these neural networks in the minimum mean-square-error sense, using the analytic linear least squares solution. The generated solution can be used as an initial condition for standard iterative optimization algorithms. However, simulations demonstrate that in most cases, the performance achieved through the proposed initialization scheme leaves little room for further improvement in the mean-square error (MSE) over the training set. In addition, the performance of the network optimized with the proposed approach also generalizes well to testing data. A rigorous derivation of the initialization algorithm is presented, and its high performance is verified with a number of benchmark training problems, including chaotic time-series prediction, classification, and nonlinear system identification with MLPs.
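The core idea is that, once every layer has a desired response, each layer's weights reduce to an ordinary linear least squares problem: invert the activation nonlinearity on the targets to obtain desired pre-activations, then solve for the weights analytically. The sketch below illustrates this for a two-layer tanh MLP in NumPy. It is a minimal illustration, not the authors' exact algorithm: in particular, the hidden-layer targets here are produced by a simple random projection of the output targets, standing in for the paper's principled backpropagation of the desired response, and the layer sizes and clipping constant are arbitrary choices.

```python
# Minimal sketch of least-squares MLP initialization from per-layer
# desired responses, assuming tanh hidden units and a linear output layer.
# The hidden-target construction is a hypothetical stand-in for the
# paper's desired-response backpropagation.
import numpy as np

rng = np.random.default_rng(0)

def lls_weights(X, Z):
    """Solve [X, 1] @ W.T ~= Z in the least-squares sense.

    X: (N, n_in) layer inputs; Z: (N, n_out) desired pre-activations.
    Returns weights W (n_out, n_in) and bias b (n_out,).
    """
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # augment with bias column
    sol, *_ = np.linalg.lstsq(Xa, Z, rcond=None)   # shape (n_in + 1, n_out)
    return sol[:-1].T, sol[-1]

def init_two_layer_mlp(X, D, n_hidden):
    """Initialize a tanh-hidden, linear-output MLP in the min-MSE sense.

    1. Produce hidden-layer targets H from the desired response D
       (here: a random projection squashed into tanh's range).
    2. Solve each layer's weights analytically by linear least squares.
    """
    N, n_out = D.shape
    # Step 1: hypothetical hidden targets in (-1, 1)
    R = rng.standard_normal((n_out, n_hidden))
    H = np.tanh(D @ R / np.sqrt(n_out))
    # Step 2a: hidden weights from desired pre-activations arctanh(H);
    # clip to keep the inverse nonlinearity finite
    Zh = np.arctanh(np.clip(H, -0.999, 0.999))
    W1, b1 = lls_weights(X, Zh)
    # Step 2b: output weights fit to the realized hidden activations
    A = np.tanh(X @ W1.T + b1)
    W2, b2 = lls_weights(A, D)
    return (W1, b1), (W2, b2)

# Toy usage: initialize a network to fit y = sin(x)
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
D = np.sin(X)
(W1, b1), (W2, b2) = init_two_layer_mlp(X, D, n_hidden=10)
Y = np.tanh(X @ W1.T + b1) @ W2.T + b2
print("MSE at initialization:", np.mean((Y - D) ** 2))
```

The printed MSE is the network's training-set error before any gradient-based refinement; per the abstract, an initialization of this kind often leaves little room for further improvement by iterative optimizers.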