Cooperation in stochastic games through communication
Source: International Conference on Autonomous Agents
Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
The Netherlands
Session: Posters: voting
Pages: 1197-1198
Year of Publication: 2005
ISBN: 1-59593-093-0
Authors
Raghav Aras, Loria / INRIA-Lorraine, France
Alain Dutech, Loria / INRIA-Lorraine, France
François Charpillet, Loria / INRIA-Lorraine, France
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1082473.1082691

ABSTRACT

The application of reinforcement learning principles to the search for equilibrium policies in stochastic games (SGs) has met with some success ([3], [4], [2]). The key insight of this approach is that each agent can learn its own β-discounted reward equilibrium policy by keeping track of the Q-values of all agents, including itself, and treating the Q-value matrix for each state as its payoff matrix. To do so, each agent observes the actions that the other agents take and the payoffs they receive. There is some evidence that, in practice, agents that do not observe the actions and payoffs of other agents (hereafter called imperfectly observing agents) can still learn adversarial equilibrium (AE) policies in general-sum SGs ([1]) using naive Q-learning. Taking the Prisoners' Dilemma stage game (Table 1) as an abstraction of an SG, this implies that even when they ignore other agents' play, agents still learn to play DD, which is the adversarial equilibrium joint action. The payoff received in DD can be thought of as each agent's security level.
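The claim about imperfectly observing agents can be illustrated with a minimal sketch of two naive Q-learners playing a repeated Prisoners' Dilemma. Everything specific here is an assumption made for illustration and is not taken from the paper: the payoff values (5, 3, 1, 0) stand in for Table 1, which is not reproduced on this page; the learning rate, exploration rate, step count, and function names are hypothetical; and because the stage game has a single state, the sketch drops the discounted bootstrap term and learns only the immediate payoff of each action. Each agent updates from its own action and its own payoff and never looks at the other agent's action or reward.

```python
import random

# Hypothetical Prisoners' Dilemma payoffs for the first-named player,
# assumed for illustration (temptation 5 > reward 3 > punishment 1 > sucker 0).
PAYOFF = {
    ("C", "C"): 3.0,
    ("C", "D"): 0.0,
    ("D", "C"): 5.0,
    ("D", "D"): 1.0,
}
ACTIONS = ("C", "D")


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over the agent's own Q-values only."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[a])


def run_naive_q_learning(steps=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Two independent learners; each sees only its own action and payoff."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(2)]
    for _ in range(steps):
        a0 = choose(q[0], epsilon, rng)
        a1 = choose(q[1], epsilon, rng)
        r0 = PAYOFF[(a0, a1)]  # payoff to agent 0
        r1 = PAYOFF[(a1, a0)]  # payoff to agent 1 (symmetric game)
        # Myopic update toward the observed payoff of the action played.
        q[0][a0] += alpha * (r0 - q[0][a0])
        q[1][a1] += alpha * (r1 - q[1][a1])
    return q


def greedy_joint_action(q):
    """Joint action formed by each agent's currently greedy action."""
    return "".join(max(ACTIONS, key=lambda a: agent_q[a]) for agent_q in q)
```

Calling `greedy_joint_action(run_naive_q_learning())` reports the joint action the two learners settle on. Under these assumed payoffs, D earns more than C against either opponent action, so each learner's estimate for D should end up above its estimate for C, which is consistent with the DD outcome described above.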

