New results on recurrent network training: unifying the algorithms and accelerating convergence
- 1 May 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 11 (3) , 697-709
- https://doi.org/10.1109/72.846741
Abstract
How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be placed into five major groups. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an online fashion. We have also developed an online version of the proposed algorithm that is based on updating the error gradient approximation in a recursive manner.
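The abstract's point that different gradient algorithms compute the same quantity by different routes can be illustrated on a toy case. The sketch below is not the paper's matrix formulation and not its new algorithm. It assumes a single tanh unit, x_t = tanh(w x_{t-1} + v u_t), with squared error summed over time; the names `w`, `v`, `grad_forward` and `grad_backward` are hypothetical. It computes dE/dw once with a forward sensitivity recursion (in the style of real-time recurrent learning) and once with a backward adjoint recursion (in the style of backpropagation through time).

```python
import math

def forward_pass(w, v, inputs, x0=0.0):
    """Return the states [x_0, x_1, ..., x_T] of the toy recurrent unit."""
    xs = [x0]
    for u in inputs:
        xs.append(math.tanh(w * xs[-1] + v * u))
    return xs

def error(w, v, inputs, targets):
    """E = 0.5 * sum_t (x_t - d_t)^2 for t = 1..T."""
    xs = forward_pass(w, v, inputs)
    return 0.5 * sum((x - d) ** 2 for x, d in zip(xs[1:], targets))

def grad_forward(w, v, inputs, targets):
    """dE/dw via a forward recursion on the sensitivity p_t = dx_t/dw."""
    x, p, g = 0.0, 0.0, 0.0
    for u, d in zip(inputs, targets):
        x_new = math.tanh(w * x + v * u)
        # Chain rule: p_t = (1 - x_t^2) * (x_{t-1} + w * p_{t-1})
        p = (1.0 - x_new ** 2) * (x + w * p)
        x = x_new
        g += (x - d) * p
    return g

def grad_backward(w, v, inputs, targets):
    """dE/dw via a backward recursion on the adjoint delta_t = dE/dx_t."""
    xs = forward_pass(w, v, inputs)
    g, carry = 0.0, 0.0  # carry holds the error flowing back from step t+1
    for t in range(len(inputs), 0, -1):
        delta = (xs[t] - targets[t - 1]) + carry
        local = delta * (1.0 - xs[t] ** 2)  # error at the pre-activation
        g += local * xs[t - 1]
        carry = local * w
    return g

# Illustrative data only; these values are assumptions, not from the paper.
inputs = [0.5, -0.3, 0.8, 0.1]
targets = [0.2, 0.0, 0.4, 0.3]
gf = grad_forward(0.7, 1.1, inputs, targets)
gb = grad_backward(0.7, 1.1, inputs, targets)
print("forward and backward gradients differ by", abs(gf - gb))
```

The two routines do the same chain-rule bookkeeping in opposite orders, so their results should agree up to floating-point rounding. The forward version needs no stored history and so suits online updates, while the backward version stores the state trajectory and sweeps it once.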