Open bandit processes and optimal scheduling of queueing networks

1 March 1988

journal article
research article
Published by Cambridge University Press (CUP) in Advances in Applied Probability

Vol. 20 (02), 447-472
https://doi.org/10.1017/S0001867800017067

Abstract

Asymptotic approximations are developed herein for the optimal policies in discounted multi-armed bandit problems in which new projects are continually appearing, commonly known as ‘open bandit problems’ or ‘arm-acquiring bandits’. It is shown that under certain stability assumptions the open bandit problem is asymptotically equivalent to a closed bandit problem in which there is no arrival of new projects, as the discount factor approaches 1. Applications of these results to optimal scheduling of queueing networks are given. In particular, Klimov's priority indices for scheduling queueing networks are shown to be limits of the Gittins indices for the associated closed bandit problem, and extensions of Klimov's results to preemptive policies and to unstable queueing systems are given.
