Dynamic learning rate optimization of the backpropagation algorithm
- 1 May 1995
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 6 (3), 669-677
- https://doi.org/10.1109/72.377972
Abstract
It has been observed by many authors that backpropagation (BP) error surfaces usually consist of a large number of flat regions as well as extremely steep regions. As a result, the BP algorithm with a fixed learning rate has low efficiency. This paper considers dynamic learning rate optimization of the BP algorithm using derivative information. An efficient method of deriving the first and second derivatives of the objective function with respect to the learning rate is explored, which does not involve explicit calculation of second-order derivatives in weight space, but rather uses the information gathered from the forward and backward propagation. Several learning rate optimization approaches are subsequently established, based on a linear expansion of the actual outputs, on line searches with an acceptable descent value, and on Newton-like methods, respectively. Simultaneous determination of the optimal learning rate and momentum is also introduced by showing the equivalence between the momentum version of BP and the conjugate gradient method. Since these approaches are constructed by simple manipulations of the obtained derivatives, the computational and storage burden scale with the network size exactly as in the standard BP algorithm, and the convergence of the BP algorithm is accelerated, with a remarkable reduction (typically by a factor of 10 to 50, depending on network architecture and application) in the running time for the overall learning process. Numerous computer simulation results are provided to support the present approaches.
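The abstract describes choosing the learning rate at each iteration from the first and second derivatives of the objective function with respect to the learning rate. The sketch below (Python/NumPy) illustrates the general idea with a Newton-like step on the learning rate along the negative gradient direction. It is not the authors' analytic derivation: where the paper obtains the derivatives of E with respect to the learning rate from the forward and backward passes, this sketch approximates them by finite differences, and the toy network, XOR data, and all hyperparameters are assumptions chosen only for illustration.

```python
# Minimal sketch: Newton-like selection of the backpropagation learning rate.
# NOTE: the paper derives dE/d(eta) and d^2E/d(eta)^2 analytically from the
# forward/backward propagation; here they are approximated by finite
# differences, which is a simplification, not the authors' method.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2-input XOR, a classic test problem for backpropagation.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

def init_params(n_in=2, n_hid=4, n_out=1):
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hid)),
        "b1": np.zeros(n_hid),
        "W2": rng.normal(scale=0.5, size=(n_hid, n_out)),
        "b2": np.zeros(n_out),
    }

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(p, X):
    h = sigmoid(X @ p["W1"] + p["b1"])
    y = sigmoid(h @ p["W2"] + p["b2"])
    return h, y

def loss(p):
    _, y = forward(p, X)
    return 0.5 * np.mean(np.sum((y - T) ** 2, axis=1))

def gradients(p):
    # Standard backpropagation for the squared-error objective.
    h, y = forward(p, X)
    delta2 = (y - T) * y * (1 - y) / X.shape[0]
    delta1 = (delta2 @ p["W2"].T) * h * (1 - h)
    return {
        "W1": X.T @ delta1, "b1": delta1.sum(axis=0),
        "W2": h.T @ delta2, "b2": delta2.sum(axis=0),
    }

def step(p, g, eta):
    return {k: p[k] - eta * g[k] for k in p}

def newton_like_eta(p, g, eta0=0.5, d_eta=1e-3, eta_max=10.0):
    # phi(eta) = E(w - eta * g); take one Newton step on eta using
    # finite-difference estimates of phi'(eta0) and phi''(eta0).
    phi = lambda eta: loss(step(p, g, eta))
    phi0, phi_plus, phi_minus = phi(eta0), phi(eta0 + d_eta), phi(eta0 - d_eta)
    d1 = (phi_plus - phi_minus) / (2 * d_eta)
    d2 = (phi_plus - 2 * phi0 + phi_minus) / d_eta ** 2
    if d2 <= 0:                      # non-convex along this ray: keep eta0
        return eta0
    return float(np.clip(eta0 - d1 / d2, 1e-3, eta_max))

p = init_params()
for epoch in range(2000):
    g = gradients(p)
    eta = newton_like_eta(p, g)      # learning rate chosen anew each epoch
    p = step(p, g, eta)

print("final loss:", loss(p))
```

Because the learning rate is re-optimized along the current descent direction at every epoch, the extra cost per step is a small, fixed number of additional forward passes, consistent with the abstract's claim that the computational burden scales with network size in the same way as standard BP.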