Learning Without State-Estimation in Partially Observable Markovian Decision Processes

1 January 1994

book chapter
Published by Elsevier

pp. 284-292
https://doi.org/10.1016/b978-1-55860-335-6.50042-8

This publication has 8 references indexed in Scilit:

A Reinforcement Learning Method for Maximizing Undiscounted Rewards
Published by Elsevier, 1993
The Convergence of TD(λ) for General λ
Machine Learning, 1992
Technical Note: Q-Learning
Machine Learning, 1992
Active Perception and Reinforcement Learning
Published by Elsevier, 1990
Learning to predict by the methods of temporal differences
Machine Learning, 1988
Pattern-recognizing stochastic learning automata
IEEE Transactions on Systems, Man, and Cybernetics, 1985
The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
Operations Research, 1978
Learning Automata - A Survey
IEEE Transactions on Systems, Man, and Cybernetics, 1974