Team-partitioned, opaque-transition reinforcement learning
- 1 April 1999
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 206-212
- https://doi.org/10.1145/301136.301195
Abstract
In this paper, we present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the concept of using action-dependent features to generalize the state space. In our work, we use a learned action-dependent feature space. TPOT-RL is an effective technique for allowing a team of agents to learn to cooperate towards the achievement of a specific goal. It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. Multi-agent scenarios are opaque-transition, as team members are not always in full communication with one another and adversaries may affect the environment. Hence, each learner cannot rely on having knowledge of future state transitions after acting in the world. TPOT-RL enables teams of agents to learn effective policies with very few training examples, even in the face of a large state space with large amounts of hidden state. The main features responsible for this are: dividing the learning task among team members, using a very coarse, action-dependent feature space, and allowing agents to gather reinforcement directly from observation of the environment. TPOT-RL is fully implemented and has been tested in the robotic soccer domain, a complex, multi-agent framework. This paper presents the algorithmic details of TPOT-RL as well as empirical results demonstrating the effectiveness of the developed multi-agent learning approach with learned features.
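The abstract names three ingredients but gives no algorithmic detail, so the following Python fragment is only a minimal sketch of how they might fit together. It assumes that each team member keeps its own small value table (team-partitioned), that the table is indexed by a coarse feature value that depends on the candidate action (action-dependent features), and that values move toward the observed reward with no next-state term (opaque transitions). The class name `PartitionedLearner`, the feature labels, the reward values, and the running-average update are all illustrative assumptions; the paper's actual update rule, feature function, and reward definition may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PartitionedLearner:
    """One team member's learner (hypothetical name, not from the paper)."""
    actions: list
    alpha: float = 0.1  # assumed learning rate
    # Value table keyed by (coarse feature value, action).
    q: dict = field(default_factory=lambda: defaultdict(float))

    def update(self, feature, action, reward):
        # Move the estimate toward the observed reward. No next-state
        # value is used, since the learner cannot see where its action led.
        key = (feature, action)
        self.q[key] += self.alpha * (reward - self.q[key])

    def best_action(self, features_by_action):
        # features_by_action maps each action to its own coarse feature
        # value, so the feature seen by the learner depends on the action.
        return max(self.actions,
                   key=lambda a: self.q[(features_by_action[a], a)])


# Each agent owns a separate table and updates only from its own experience.
team = {name: PartitionedLearner(actions=["pass", "shoot"], alpha=0.5)
        for name in ("agent_1", "agent_2")}

learner = team["agent_1"]
learner.update("open", "pass", 1.0)       # assumed observed reward
learner.update("open", "pass", 1.0)
learner.update("blocked", "shoot", -1.0)
choice = learner.best_action({"pass": "open", "shoot": "blocked"})
```

Because every table is tiny and private to one agent, each learner needs only a handful of its own observations to start preferring one action over another, which is the property the abstract attributes to the coarse feature space and the partitioning.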
This publication has 5 references indexed in Scilit:
- Layered approach to learning client behaviors in the RoboCup soccer server. Applied Artificial Intelligence, 1998
- Towards collaborative and adversarial learning: a case study in robotic soccer. International Journal of Human-Computer Studies, 1998
- Using decision tree confidence factors for multi-agent control. Published by Association for Computing Machinery (ACM), 1998
- Learning Team Strategies: Soccer Case Studies. Machine Learning, 1998
- Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning, 1996