Conditional Universal Consistency

    • preprint
    • Published in RePEc
Abstract
Players choose an action before learning an outcome chosen according to an unknown and history-dependent stochastic rule. Procedures that categorize outcomes and use a randomized variation on fictitious play within each category are studied. These procedures are "conditionally consistent": they yield almost as high a time-average payoff as if the player knew the conditional distributions of actions given categories. Moreover, given any alternative procedure, there is a conditionally consistent procedure whose performance is no more than epsilon worse regardless of the discount factor. We also discuss cycles, and argue that the time-average of play should resemble a correlated equilibrium.
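The kind of procedure the abstract describes can be illustrated with a minimal sketch. It is not the authors' construction: the abstract does not specify the randomization, so the logit (softmax) response below, the temperature parameter, the payoff-matrix layout, and every class and method name are assumptions chosen for illustration. The sketch keeps separate outcome counts for each category and, within a category, randomizes over actions in proportion to their exponentiated expected payoff against that category's empirical outcome frequencies.

```python
import math
import random
from collections import defaultdict


class ConditionalSmoothFictitiousPlay:
    """Hypothetical sketch: logit-smoothed fictitious play run per category."""

    def __init__(self, payoff, temperature=0.1, seed=None):
        # payoff[action][outcome] is the player's payoff (assumed layout).
        self.payoff = payoff
        self.n_actions = len(payoff)
        self.n_outcomes = len(payoff[0])
        # Smaller temperature means play closer to an exact best response.
        self.temperature = temperature
        self.rng = random.Random(seed)
        # One vector of outcome counts per category.
        self.counts = defaultdict(lambda: [0] * self.n_outcomes)

    def action_probabilities(self, category):
        counts = self.counts[category]
        total = sum(counts)
        if total == 0:
            # No data for this category yet: randomize uniformly.
            return [1.0 / self.n_actions] * self.n_actions
        freq = [c / total for c in counts]
        # Expected payoff of each action against the empirical frequencies.
        expected = [sum(p * u for p, u in zip(freq, row)) for row in self.payoff]
        best = max(expected)
        # Subtract the maximum before exponentiating for numerical stability.
        weights = [math.exp((e - best) / self.temperature) for e in expected]
        z = sum(weights)
        return [w / z for w in weights]

    def choose(self, category):
        probs = self.action_probabilities(category)
        return self.rng.choices(range(self.n_actions), weights=probs)[0]

    def observe(self, category, outcome):
        # Update only the counts of the category that was active this period.
        self.counts[category][outcome] += 1
```

In use, the player would call choose(category) before the outcome is revealed and observe(category, outcome) afterwards. How histories are mapped to categories is left open here, as it is in the abstract.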