PAC Bounds for Multi-armed Bandit and Markov Decision Processes
- 25 June 2002
- book chapter
- Published by Springer Nature
- p. 255-270
- https://doi.org/10.1007/3-540-45435-7_18
Abstract
No abstract available
This publication has 10 references indexed in Scilit:
- Gambling in a rigged casino: The adversarial multi-armed bandit problem. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing, 2002
- Learning Rates for Q-Learning. Published by Springer Nature, 2001
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning. SIAM Journal on Control and Optimization, 2000
- Neural Network Learning. Published by Cambridge University Press (CUP), 1999
- PAC adaptive control of linear systems. Published by Association for Computing Machinery (ACM), 1997
- Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 1985
- Bandit problems. Published by Springer Nature, 1985
- Sequential Analysis and Optimal Design. Published by Society for Industrial & Applied Mathematics (SIAM), 1972
- Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 1952