Two-armed bandits with a goal, II. Dependent arms
- 1 December 1980
- Journal article
- Published by Cambridge University Press (CUP) in Advances in Applied Probability
- Vol. 12 (4), 958-971
- https://doi.org/10.2307/1426751
Abstract
One of two random variables, X and Y, can be selected at each of a possibly infinite number of stages. Depending on the outcome, one's fortune is either increased or decreased by 1. The probability of increase may not be known for either X or Y. The objective is to increase one's fortune to G before it decreases to g, for some integral g and G; either may be infinite.

In Part I (Berry and Fristedt (1980)), the distribution of X is unknown and that of Y is known. In the current part, it is known that one of X and Y has probability α of increasing the current fortune by 1 and the other has probability β, where α and β are known, but which probability goes with X is not known. We show that optimal strategies exist in general and find all optimal strategies when α = 0 and when α + β = 1. In both cases myopic strategies are shown to be optimal. A counterexample is used to show that myopic strategies, while intuitively very appealing, are not optimal for general (α, β).
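Although the paper's results are analytic, the setup is easy to simulate. Below is a minimal Python sketch (all names and parameter values are illustrative, not from the paper) of the myopic rule: because the arms are dependent, a single posterior weight w = P(X is the α-arm) summarizes one's beliefs, and the myopic gambler pulls whichever arm has the larger posterior probability of success, updating w by Bayes' rule after each outcome. The paper proves this rule optimal when α = 0 and when α + β = 1, but shows by counterexample that it can fail for general (α, β).

```python
import random

def simulate_myopic(alpha, beta, start, g, G, rng=None):
    """One play of the goal problem under the myopic rule.

    One arm succeeds with probability alpha, the other with beta;
    which is which is unknown (prior 1/2 each).  Returns True if the
    fortune reaches G before falling to g.
    """
    rng = rng or random.Random()
    # Hidden truth: assign alpha to arm X or arm Y with equal probability.
    x_is_alpha = rng.random() < 0.5
    p = {"X": alpha if x_is_alpha else beta,
         "Y": beta if x_is_alpha else alpha}
    w = 0.5                  # posterior P(X is the alpha-arm)
    fortune = start
    while g < fortune < G:
        # Posterior success probability of each arm under belief w.
        px = w * alpha + (1 - w) * beta
        py = w * beta + (1 - w) * alpha
        arm = "X" if px >= py else "Y"   # myopic: maximize immediate gain
        success = rng.random() < p[arm]
        fortune += 1 if success else -1
        # Bayes update of w given the outcome on the chosen arm.
        if arm == "X":
            la = alpha if success else 1 - alpha  # likelihood if X is the alpha-arm
            lb = beta if success else 1 - beta    # likelihood if X is the beta-arm
        else:
            la = beta if success else 1 - beta
            lb = alpha if success else 1 - alpha
        w = w * la / (w * la + (1 - w) * lb)
    return fortune >= G

# Estimate the probability of reaching G = 5 before g = -5 from fortune 0
# with (alpha, beta) = (0.7, 0.4) -- illustrative values only.
rng = random.Random(1)
wins = sum(simulate_myopic(0.7, 0.4, 0, -5, 5, rng) for _ in range(10_000))
print(f"estimated success probability: {wins / 10_000:.3f}")
```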
References
- Two-armed bandits with a goal, I. One arm known. Advances in Applied Probability, 1980.
- A Note on the Bernoulli Two-Armed Bandit Problem. The Annals of Statistics, 1974.
- A Bernoulli Two-armed Bandit. The Annals of Mathematical Statistics, 1972.
- Some Remarks on the Two-Armed Bandit. The Annals of Mathematical Statistics, 1970.
- Contributions to the "Two-Armed Bandit" Problem. The Annals of Mathematical Statistics, 1962.