Initializing Weights of a Multilayer Perceptron Network by Using the Orthogonal Least Squares Algorithm
- 1 September 1995
- journal article
- Published by MIT Press in Neural Computation
- Vol. 7 (5), 982-999
- https://doi.org/10.1162/neco.1995.7.5.982
Abstract
Usually the training of a multilayer perceptron network starts by initializing the network weights with small random values; the weights are then adjusted with an iterative gradient-descent-based optimization routine known as backpropagation training. If the random initial weights happen to be far from a good solution, or near a poor local optimum, training takes a long time because many iteration steps are required. Furthermore, it is quite possible that the network will not converge to an adequate solution at all. On the other hand, if the initial weights are close to a good solution, training is much faster and the likelihood of adequate convergence increases. In this paper a new method for initializing the weights is presented. The method is based on the orthogonal least squares algorithm. Simulation results obtained with the proposed initialization method show a considerable improvement in training compared to randomly initialized networks. In practical experiments, the proposed method has proven fast and useful for initializing the network weights.
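The abstract does not spell out the initialization procedure, so the following is only a minimal sketch of the underlying idea, assuming the classic orthogonal least squares forward selection of Chen et al. (1991, reference below): a pool of randomly generated candidate hidden units is scored by its error-reduction ratio on the training targets, the best candidates are kept as the initial hidden layer, and the output-layer weights are solved by linear least squares. All names here (`ols_select`, the tanh activations, the candidate pool size) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ols_select(Phi, y, n_select):
    """Greedy orthogonal least squares forward selection (after Chen et al., 1991).

    Phi : (N, M) matrix of candidate regressor outputs, e.g. candidate
          hidden-unit activations evaluated on the training inputs.
    y   : (N,) target vector.
    Returns the indices of the n_select candidates chosen greedily by
    error-reduction ratio, with Gram-Schmidt orthogonalization against
    the regressors already selected.
    """
    N, M = Phi.shape
    selected, Q = [], []  # chosen indices and their orthogonalized columns
    for _ in range(n_select):
        best_err, best_j, best_q = -1.0, None, None
        for j in range(M):
            if j in selected:
                continue
            q = Phi[:, j].astype(float)
            # Orthogonalize candidate against previously selected regressors
            for qk in Q:
                q = q - (qk @ Phi[:, j]) / (qk @ qk) * qk
            denom = q @ q
            if denom < 1e-12:  # candidate is (nearly) linearly dependent
                continue
            # Error-reduction ratio: share of target energy explained
            err = (q @ y) ** 2 / (denom * (y @ y))
            if err > best_err:
                best_err, best_j, best_q = err, j, q
        if best_j is None:
            break
        selected.append(best_j)
        Q.append(best_q)
    return selected

# Hypothetical usage: initialize a one-hidden-layer MLP for regression.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))             # training inputs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]          # training targets

M, H = 50, 8                                 # candidate pool size, hidden units
W_cand = rng.normal(0.0, 1.0, (M, 2))        # candidate input weights
b_cand = rng.normal(0.0, 1.0, M)             # candidate biases
Phi = np.tanh(X @ W_cand.T + b_cand)         # candidate hidden activations

idx = ols_select(Phi, y, H)                  # keep the H best candidates
W1, b1 = W_cand[idx], b_cand[idx]            # initial hidden-layer weights
H_act = np.tanh(X @ W1.T + b1)
w2, *_ = np.linalg.lstsq(                    # output layer by least squares
    np.column_stack([H_act, np.ones(len(X))]), y, rcond=None)
```

Under this reading, the selected weights then serve only as the starting point: ordinary backpropagation training would proceed from `W1`, `b1`, and `w2`, which the abstract reports converges considerably faster than training from random initial weights.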