PAC Bounds for Multi-armed Bandit and Markov Decision Processes

No abstract available

This publication has 10 references indexed in Scilit:

Gambling in a rigged casino: The adversarial multi-armed bandit problem
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
The Nonstochastic Multiarmed Bandit Problem
SIAM Journal on Computing, 2002
Learning Rates for Q-Learning
Published by Springer Nature, 2001
The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
SIAM Journal on Control and Optimization, 2000
Neural Network Learning
Published by Cambridge University Press (CUP), 1999
PAC adaptive control of linear systems
Published by Association for Computing Machinery (ACM), 1997
Asymptotically efficient adaptive allocation rules
Advances in Applied Mathematics, 1985
Bandit problems
Published by Springer Nature, 1985
Sequential Analysis and Optimal Design
Published by Society for Industrial & Applied Mathematics (SIAM), 1972
Some aspects of the sequential design of experiments
Bulletin of the American Mathematical Society, 1952