Adaptive critic designs
- 1 September 1997
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 8 (5), 997-1007
- https://doi.org/10.1109/72.623201
Abstract
We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.
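To make the difference between the first two design families concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical scalar closed-loop system x(t+1) = A·x(t), a quadratic utility U(x) = x², and one-parameter critics in place of neural networks; the names `A`, `GAMMA`, `train_hdp_critic`, and `train_dhp_critic` are illustrative only. The HDP-style critic estimates the discounted cost-to-go J itself, while the DHP-style critic estimates its derivative with respect to the state.

```python
import random

A = 0.9        # assumed closed-loop dynamics: x(t+1) = A * x(t)
GAMMA = 0.95   # assumed discount factor


def utility(x):
    """Assumed one-step cost U(x) = x^2."""
    return x * x


def train_hdp_critic(steps=5000, lr=0.5, seed=0):
    """HDP-style critic J_hat(x) = w * x^2, trained toward U(x) + GAMMA * J_hat(x_next)."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        x_next = A * x
        # Bootstrapped target, treated as a constant during the update.
        target = utility(x) + GAMMA * w * x_next ** 2
        error = target - w * x ** 2
        w += lr * error * x ** 2      # gradient step on 0.5 * error^2 w.r.t. w
    return w


def train_dhp_critic(steps=5000, lr=0.5, seed=0):
    """DHP-style critic lambda_hat(x) = v * x, an estimate of dJ/dx."""
    rng = random.Random(seed)
    v = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        x_next = A * x
        # Target: dU/dx + GAMMA * lambda_hat(x_next) * d(x_next)/dx
        target = 2.0 * x + GAMMA * (v * x_next) * A
        error = target - v * x
        v += lr * error * x           # gradient step on 0.5 * error^2 w.r.t. v
    return v


if __name__ == "__main__":
    # Closed form for this toy problem: J(x) = x^2 / (1 - GAMMA * A^2)
    print("HDP weight:", train_hdp_critic())
    print("DHP weight:", train_dhp_critic())
```

In this toy setting the HDP weight should approach 1 / (1 - GAMMA·A²) and the DHP weight should approach twice that value, since the DHP critic represents the derivative of the same quadratic cost-to-go. A GDHP-style critic would combine both error terms in a single training signal; that case is not sketched here.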