Finite-Time Performance of Some Two-Armed Bandit Controllers
- 1 March 1973
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics
- Vol. SMC-3 (2), 194-197
- https://doi.org/10.1109/tsmc.1973.5408504
Abstract
A class of asymptotically ε-optimal two-armed bandit controllers is given, and two criteria for comparing the long-term finite-time performance of controllers in this class are proposed. The performances of three particular controllers are compared using the criteria, and the analysis is confirmed by computer iteration of the appropriate probability recurrence relations.
This publication has 4 references indexed in Scilit:
- A Note on the Linear Reinforcement Scheme for Variable-Structure Stochastic Automata. IEEE Transactions on Systems, Man, and Cybernetics, 1972
- The two-armed-bandit problem with time-invariant finite memory. IEEE Transactions on Information Theory, 1970
- Stochastic Computing Systems. Published by Springer Nature, 1969
- Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria. IEEE Transactions on Systems Science and Cybernetics, 1969