An optimal sequential policy for controlling a Markov renewal process
- 1 June 1985
- journal article
- Published by Cambridge University Press (CUP) in Journal of Applied Probability
- Vol. 22 (2), 324-335
- https://doi.org/10.2307/3213776
Abstract
This paper discusses a renewal process whose time development between renewals is described by a Markov process. The process may be controlled by choosing the times at which renewal occurs, the objective of the control being to maximise the long-term average rate of reward. Let γ* denote the maximum achievable rate. We consider a specific policy in which a sequence of estimates of γ* is made. This sequence is defined inductively as follows. Initially an (a priori) estimate γ₀ is chosen. On making the nth renewal one estimates γ* in terms of γ₀, the total rewards obtained in the first n renewal cycles and the total length of these cycles. The resulting estimate γₙ then determines the length of the (n + 1)th cycle. It is shown that γₙ tends to γ* as n tends to ∞, and that this policy is optimal.
The time at which the (n + 1)th renewal is made is determined by solving a stopping problem for the Markov process with continuation cost γₙ per unit time and stopping reward equal to the renewal reward. Thus, in general, implementation of this policy requires a knowledge of the transition probabilities of the Markov process. An example is presented in which one needs to know essentially nothing about the details of this process or the fine details of the reward structure in order to implement the policy. The example is based on a problem in biology.
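The estimate-then-stop loop described in the abstract can be illustrated with a small Python sketch. It is not the paper's construction: the abstract does not give the update formula, so the prior is treated here as `prior_weight` units of pseudo-time earning at rate `gamma0` (an assumption). The Markov process is also replaced by a hypothetical deterministic patch with gain `1 - exp(-t)` and a fixed travel time between patches, so that the stopping problem has a closed-form solution. All function and parameter names are illustrative.

```python
import math


def gain(t):
    """Hypothetical cumulative reward after t time units in a patch."""
    return 1.0 - math.exp(-t)


def stopping_time(gamma):
    """Time t >= 0 maximising gain(t) - gamma * t (continuation cost gamma).

    The marginal gain is exp(-t), so the maximiser is t = -ln(gamma)
    when gamma < 1 and t = 0 otherwise.
    """
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    return 0.0 if gamma >= 1.0 else -math.log(gamma)


def run_policy(gamma0, travel_time=1.0, prior_weight=1.0, n_cycles=5000):
    """Return the sequence of rate estimates gamma_0, gamma_1, ..., gamma_n.

    Assumed update rule (not taken from the paper):
    gamma_n = (gamma0 * prior_weight + total reward)
              / (prior_weight + total cycle length).
    """
    total_reward = gamma0 * prior_weight
    total_time = prior_weight
    gamma = gamma0
    estimates = [gamma]
    for _ in range(n_cycles):
        t = stopping_time(gamma)          # length of stay set by current estimate
        total_reward += gain(t)           # reward collected at renewal
        total_time += travel_time + t     # full cycle length
        gamma = total_reward / total_time
        estimates.append(gamma)
    return estimates


def gamma_star(travel_time=1.0):
    """Reference value: the rate at which the best stopping surplus is zero."""
    def surplus(gamma):
        t = stopping_time(gamma)
        return gain(t) - gamma * (travel_time + t)

    lo, hi = 1e-12, 1.0                   # surplus > 0 at lo, < 0 at hi
    for _ in range(100):                  # plain bisection
        mid = 0.5 * (lo + hi)
        if surplus(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under these assumptions, calling `run_policy(0.9)` and `run_policy(0.05)` yields estimates that approach the same value as `gamma_star()`, about 0.32 for a travel time of 1. In this deterministic toy case the stopping rule reduces to leaving the patch when the marginal gain falls to the current estimate, which is the marginal value rule from the foraging literature.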
This publication has 11 references indexed in Scilit:
- Optimal foraging, the marginal value theorem. Published by Elsevier, 2004
- Downy Woodpecker Foraging Behavior: Efficient Sampling in Simple Stochastic Environments. Ecology, 1984
- Optimal patch use in a stochastic environment. Theoretical Population Biology, 1982
- Prey Distribution as a Factor Determining the Choice of Optimal Foraging Strategy. The American Naturalist, 1981
- Bayesian birds: A simple example of Oaten's stochastic model of optimal foraging. Theoretical Population Biology, 1980
- Optimal foraging in patches: A case for stochasticity. Theoretical Population Biology, 1977
- Coevolution of Foraging in Bombus and Nectar Dispensing in Chilopsis: A Last Dreg Theory. Science, 1977
- Optimal foraging in great tits (Parus major). Nature, 1977
- Hunting by expectation or optimal foraging? A study of patch use by chickadees. Animal Behaviour, 1974
- Average Renewal Loss Rates. The Annals of Mathematical Statistics, 1963