Learning Without State-Estimation in Partially Observable Markovian Decision Processes
- 1 January 1994
- book chapter
- Published by Elsevier
Abstract
No abstract available
This publication has 8 references indexed in Scilit:
- A Reinforcement Learning Method for Maximizing Undiscounted Rewards. Published by Elsevier, 1993
- The Convergence of TD(λ) for General λ. Machine Learning, 1992
- Technical Note: Q-Learning. Machine Learning, 1992
- Active Perception and Reinforcement Learning. Published by Elsevier, 1990
- Learning to predict by the methods of temporal differences. Machine Learning, 1988
- Pattern-recognizing stochastic learning automata. IEEE Transactions on Systems, Man, and Cybernetics, 1985
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs. Operations Research, 1978
- Learning Automata - A Survey. IEEE Transactions on Systems, Man, and Cybernetics, 1974