Adaptive critic designs

1 September 1997

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 8 (5), 997-1007
https://doi.org/10.1109/72.623201

Abstract

We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.
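To make the difference between the first two design families concrete, the following is a minimal sketch and not the training procedure from the article. It assumes a toy scalar system x(t+1) = a·x(t) with no action network, a utility U(x) = x², and one-parameter critics; the function names, parameters, and the toy plant are all hypothetical. An HDP-style critic approximates the cost-to-go J itself, while a DHP-style critic approximates its derivative with respect to the state.

```python
import random


def train_hdp_critic(a=0.5, gamma=0.9, lr=0.1, steps=5000, seed=0):
    """HDP-style critic: fit J(x) ~ w * x^2 to the target U(x) + gamma * J(x_next).

    Toy assumption: autonomous plant x_next = a * x and utility U(x) = x^2.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        x_next = a * x
        # Bellman-style target, held fixed while the critic weight is updated.
        target = x * x + gamma * w * x_next * x_next
        error = w * x * x - target
        w -= lr * error * x * x  # gradient of 0.5 * error^2 with respect to w
    return w


def train_dhp_critic(a=0.5, gamma=0.9, lr=0.1, steps=5000, seed=0):
    """DHP-style critic: fit lambda(x) ~ v * x, an estimate of dJ/dx.

    The target is dU/dx + gamma * lambda(x_next) * d(x_next)/dx.
    """
    rng = random.Random(seed)
    v = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        x_next = a * x
        # Derivative of the utility plus the discounted critic output
        # propagated back through the plant Jacobian d(x_next)/dx = a.
        target = 2.0 * x + gamma * (v * x_next) * a
        error = v * x - target
        v -= lr * error * x  # gradient of 0.5 * error^2 with respect to v
    return v


if __name__ == "__main__":
    print("HDP critic weight:", train_hdp_critic())
    print("DHP critic weight:", train_dhp_critic())
```

For this toy problem the exact cost-to-go is J(x) = x² / (1 − γa²), so the HDP weight should approach 1 / (1 − γa²) and the DHP weight, which models dJ/dx, should approach twice that value. A GDHP critic would be trained on both quantities together; that combination is not shown here.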
