On the problem of local minima in recurrent neural networks
- 1 March 1994
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 5 (2), 167-177
- https://doi.org/10.1109/72.279182
Abstract
Many researchers have recently focused their efforts on devising efficient algorithms, mainly based on optimization schemes, for learning the weights of recurrent neural networks. As in the case of feedforward networks, however, these learning algorithms may get stuck in local minima during gradient descent, thus converging to sub-optimal solutions. This paper analyses the problem of optimal learning in recurrent networks by proposing conditions that guarantee local-minima-free error surfaces. An example is given that also shows the constructive role of the proposed theory in designing networks suited to a given task. Moreover, a formal relationship between recurrent and static feedforward networks is established, such that the examples of local minima for feedforward networks already known in the literature can be associated with analogous ones in recurrent networks.
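For intuition only, the sketch below (not from the paper) shows the phenomenon the abstract describes: gradient descent on a small recurrent network can end up at different, sometimes sub-optimal, solutions depending on the initial weights. The toy parity-like task, the single-unit tanh architecture, the finite-difference gradient, and all hyperparameters are illustrative assumptions, not the authors' construction.

```python
# Illustrative sketch: different random initializations of a tiny recurrent
# network trained by gradient descent can reach different final losses,
# i.e., some runs get stuck in poor regions of the error surface.
# Everything here (task, architecture, hyperparameters) is an assumption
# made for illustration; it is not the construction from the paper.
import numpy as np

def run_rnn(w_in, w_rec, w_out, xs):
    """Single-unit tanh RNN; returns the readout after the last time step."""
    h = 0.0
    for x in xs:
        h = np.tanh(w_in * x + w_rec * h)
    return w_out * h

def loss(params, data):
    """Sum of squared errors over all (sequence, target) pairs."""
    w_in, w_rec, w_out = params
    return sum((run_rnn(w_in, w_rec, w_out, xs) - y) ** 2 for xs, y in data)

def numerical_grad(params, data, eps=1e-5):
    """Central-difference gradient; adequate for a 3-parameter toy model."""
    g = np.zeros_like(params)
    for i in range(len(params)):
        p_plus, p_minus = params.copy(), params.copy()
        p_plus[i] += eps
        p_minus[i] -= eps
        g[i] = (loss(p_plus, data) - loss(p_minus, data)) / (2 * eps)
    return g

# Toy parity-like task: target +1 if the sequence has an odd number of ones,
# -1 otherwise; parity is a classic source of difficult error surfaces.
data = [([0, 0, 1], 1.0), ([1, 1, 0], -1.0), ([1, 0, 1], -1.0),
        ([0, 1, 1], -1.0), ([1, 0, 0], 1.0), ([0, 0, 0], -1.0)]

# Run plain gradient descent from a few random starting points; the final
# losses typically differ across seeds, showing convergence to distinct
# (and sometimes clearly sub-optimal) solutions.
for seed in range(3):
    rng = np.random.default_rng(seed)
    params = rng.normal(scale=2.0, size=3)
    for _ in range(2000):
        params -= 0.02 * numerical_grad(params, data)
    print(f"seed {seed}: final loss = {loss(params, data):.4f}")
```

The paper's contribution is the converse direction: identifying conditions on the network and task under which such seed-dependent, sub-optimal outcomes cannot occur because the error surface is free of local minima.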