Discrete recurrent neural networks for grammatical inference
- 1 March 1994
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 5 (2), 320-330
- https://doi.org/10.1109/72.279194
Abstract
Describes a novel neural architecture for learning deterministic context-free grammars, or equivalently, deterministic pushdown automata. The unique feature of the proposed network is that it forms stable state representations during learning; previous work has shown that conventional analog recurrent networks can be inherently unstable, in that they cannot retain their state memory over long input strings. The authors have previously introduced the discrete recurrent network architecture for learning finite-state automata. Here they extend this model to include an external stack of discrete symbols. A composite error function is described to handle the different situations encountered in learning, and the pseudo-gradient learning method (introduced in previous work) is in turn extended to minimize these error functions. Empirical trials validating the effectiveness of the pseudo-gradient learning method are presented for networks both with and without an external stack. Experimental results show that the new networks succeed in learning some simple pushdown automata, though overfitting and non-convergent learning can also occur. Once learned, the internal representation of the network is provably stable; i.e., it classifies unseen strings of arbitrary length with 100% accuracy.
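As context for the abstract's "pseudo-gradient" idea, here is a minimal sketch (not the authors' code) of the basic mechanism on a second-order recurrent network without the external stack: the forward pass hard-thresholds each state neuron so the state stays discrete, while learning substitutes the analog sigmoid's derivative for the threshold's zero-almost-everywhere derivative. All sizes, the accept/reject readout convention, and the truncation of credit assignment to the last transition are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 state neurons, 2 input symbols (one-hot coded).
N_STATES, N_SYMBOLS, LR = 4, 2, 0.5

# Second-order weights: W[i, j, k] couples current state j and input k
# to next-state neuron i, as in second-order recurrent networks.
W = rng.normal(0.0, 0.5, size=(N_STATES, N_STATES, N_SYMBOLS))

def run(symbols):
    """Forward pass. The state vector is hard-thresholded to {0, 1} at
    every step, which is what keeps the representation stable on
    arbitrarily long strings; the analog sigmoid values are recorded
    only so a pseudo-gradient can be formed during learning.
    (The paper's external discrete stack is omitted in this sketch.)"""
    state = np.zeros(N_STATES)
    state[0] = 1.0                            # fixed start state
    trace = []
    for sym in symbols:
        x = np.eye(N_SYMBOLS)[sym]
        net = np.einsum('ijk,j,k->i', W, state, x)
        analog = sigmoid(net)
        trace.append((state.copy(), x, analog))
        state = (analog > 0.5).astype(float)  # discretization
    return state, trace

def pseudo_gradient_update(symbols, label):
    """One training update. The threshold's true derivative is zero
    almost everywhere, so the sigmoid's derivative is substituted for
    it (a straight-through-style surrogate). Credit assignment is
    truncated to the last transition purely to keep the sketch short;
    the paper propagates its pseudo-gradient through the sequence."""
    global W
    state, trace = run(symbols)
    err = state[0] - label                    # neuron 0 read out as accept/reject
    prev, x, analog = trace[-1]
    d_net = err * analog[0] * (1.0 - analog[0])
    W[0] -= LR * d_net * np.outer(prev, x)
    return err ** 2

# Toy usage: strings over {0, 1}, labeled 1 iff the string ends in symbol 1.
for epoch in range(200):
    for s, y in [([0, 1], 1.0), ([1, 0], 0.0), ([0, 0, 1], 1.0), ([1, 1, 0], 0.0)]:
        pseudo_gradient_update(s, y)
```

Because the stored state is binary after every transition, an extracted automaton behaves identically on strings of any length, which is the sense in which the abstract calls the learned representation provably stable.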