On the Gittins Index for Multiarmed Bandits

Open Access

1 November 1992

journal article
Published by Institute of Mathematical Statistics in The Annals of Applied Probability

Vol. 2 (4), 1024-1033
https://doi.org/10.1214/aoap/1177005588

Abstract

This paper considers the multiarmed bandit problem and presents a new proof of the optimality of the Gittins index policy. The proof is intuitive and does not require an interchange argument. The insight it affords is used to give a streamlined summary of previous research and to prove a new result: The optimal value function is a submodular set function of the available projects.
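
For reference, the two notions named in the abstract can be written in their standard textbook form. The notation here is an assumption made for illustration and is not taken from the paper: $x$ is the state of a single project, $r(x_t)$ is its reward at time $t$, $\beta \in (0,1)$ is a discount factor, $\tau$ is a stopping time, and $V(S)$ is the optimal value when the set $S$ of projects is available.

\[
G(x) = \sup_{\tau \ge 1}
\frac{E\left[\sum_{t=0}^{\tau-1} \beta^{t} r(x_t) \,\middle|\, x_0 = x\right]}
     {E\left[\sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, x_0 = x\right]},
\qquad
V(S \cup T) + V(S \cap T) \le V(S) + V(T) \quad \text{for all sets } S, T.
\]

The first expression is the usual definition of the Gittins index as the largest achievable ratio of expected discounted reward to expected discounted time. The second is the standard definition of a submodular set function.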

Keywords

MULTIARMED BANDIT PROBLEM
STOCHASTIC SCHEDULING
MARKOV DECISION PROCESSES
OPTIMAL STOPPING
SEQUENTIAL METHODS
