Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost

1 October 1988

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Automatic Control

Vol. 33 (10) , 899-906
https://doi.org/10.1109/9.7243

Abstract

The authors consider multiarmed bandit problems with switching cost, define uniformly good allocation rules, and restrict attention to such rules. They present a lower bound on the asymptotic performance of uniformly good allocation rules and construct an allocation scheme that achieves the bound. It is found that despite the inclusion of a switching cost the proposed allocation scheme achieves the same asymptotic performance as the optimal rule for the bandit problem without switching cost. This is made possible by grouping together samples into blocks of increasing sizes, thereby reducing the number of switches to O(log n). Finally, an optimal allocation scheme for a large class of distributions which includes members of the exponential family is illustrated.

Keywords

This publication has 5 references indexed in Scilit:

Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards
IEEE Transactions on Automatic Control, 1987
Asymptotically efficient adaptive allocation rules
Advances in Applied Mathematics, 1985
Sequential Analysis
Published by Springer Nature ,1985
Some thoughts on stochastic adaptive control
Published by Institute of Electrical and Electronics Engineers (IEEE) ,1984
Some aspects of the sequential design of experiments
Bulletin of the American Mathematical Society, 1952