On the problem of local minima in recurrent neural networks
- 1 March 1994
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 5 (2), 167-177
- https://doi.org/10.1109/72.279182
Abstract
Many researchers have recently focused their efforts on devising efficient algorithms, mainly based on optimization schemes, for learning the weights of recurrent neural networks. As in the case of feedforward networks, however, these learning algorithms may get stuck in local minima during gradient descent, thus converging to sub-optimal solutions. This paper analyses the problem of optimal learning in recurrent networks by proposing conditions that guarantee local-minima-free error surfaces. An example is given that also shows the constructive role of the proposed theory in designing networks suited to a given task. Moreover, a formal relationship between recurrent and static feedforward networks is established, such that the examples of local minima for feedforward networks already known in the literature can be associated with analogous ones in recurrent networks.
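For intuition only, the sketch below (not from the paper) shows the phenomenon the abstract describes: gradient descent on a small recurrent network can end up at different, sometimes sub-optimal, solutions depending on the initial weights. The toy parity-like task, the single-unit tanh architecture, the finite-difference gradient, and all hyperparameters are illustrative assumptions, not the authors' construction.

```python
# Illustrative sketch: different random initializations of a tiny recurrent
# network trained by gradient descent can reach different final losses,
# i.e., some runs get stuck in poor regions of the error surface.
# Everything here (task, architecture, hyperparameters) is an assumption
# made for illustration; it is not the construction from the paper.
import numpy as np

def run_rnn(w_in, w_rec, w_out, xs):
    """Single-unit tanh RNN; returns the readout after the last time step."""
    h = 0.0
    for x in xs:
        h = np.tanh(w_in * x + w_rec * h)
    return w_out * h

def loss(params, data):
    """Sum of squared errors over all (sequence, target) pairs."""
    w_in, w_rec, w_out = params
    return sum((run_rnn(w_in, w_rec, w_out, xs) - y) ** 2 for xs, y in data)

def numerical_grad(params, data, eps=1e-5):
    """Central-difference gradient; adequate for a 3-parameter toy model."""
    g = np.zeros_like(params)
    for i in range(len(params)):
        p_plus, p_minus = params.copy(), params.copy()
        p_plus[i] += eps
        p_minus[i] -= eps
        g[i] = (loss(p_plus, data) - loss(p_minus, data)) / (2 * eps)
    return g

# Toy parity-like task: target +1 if the sequence has an odd number of ones,
# -1 otherwise; parity is a classic source of difficult error surfaces.
data = [([0, 0, 1], 1.0), ([1, 1, 0], -1.0), ([1, 0, 1], -1.0),
        ([0, 1, 1], -1.0), ([1, 0, 0], 1.0), ([0, 0, 0], -1.0)]

# Run plain gradient descent from a few random starting points; the final
# losses typically differ across seeds, showing convergence to distinct
# (and sometimes clearly sub-optimal) solutions.
for seed in range(3):
    rng = np.random.default_rng(seed)
    params = rng.normal(scale=2.0, size=3)
    for _ in range(2000):
        params -= 0.02 * numerical_grad(params, data)
    print(f"seed {seed}: final loss = {loss(params, data):.4f}")
```

The paper's contribution is the converse direction: identifying conditions on the network and task under which such seed-dependent, sub-optimal outcomes cannot occur because the error surface is free of local minima.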