DOI: 10.1145/1082473.1082691
Article

Cooperation in stochastic games through communication

Published: 25 July 2005

Abstract

The application of reinforcement learning principles to the search for equilibrium policies in stochastic games (SGs) has met with some success ([2], [3], [4]). The key insight of this approach is that each agent can learn his own β-discounted reward equilibrium policy by keeping track of the Q-values of all the agents, including himself, and treating the Q-value matrix for each state as his payoff matrix. In this setting, each agent sees what actions the other agents take and what payoffs they receive. There is some evidence that, in practice, agents that do not observe the actions and payoffs of other agents (hereafter denoted as imperfectly observing agents) can still learn adversarial equilibrium (AE) policies in general-sum SGs ([1]) using naive Q-learning. Considering the Prisoners' Dilemma stage game (Table 1) as an abstraction of an SG, this implies that, even when ignoring other agents' play, agents still learn to play DD (mutual defection), which is the adversarial equilibrium joint action. The payoff received in DD can be thought of as each agent's security level.
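The claim about imperfectly observing agents can be illustrated with a minimal sketch of naive (independent) Q-learning on a repeated Prisoners' Dilemma. This is not the authors' experimental setup. Table 1 is not reproduced here, so the payoff values are the common textbook ones and are an assumption, as are the names `PAYOFFS`, `greedy`, and `train` and all hyperparameters. Each agent updates only its own Q-values from its own reward and never sees the other agent's action or payoff.

```python
import random

# Assumed Prisoners' Dilemma payoffs (agent 0, agent 1); the paper's Table 1
# is not available here, so these standard values are hypothetical.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")


def greedy(q):
    """Return the action with the highest Q-value (ties go to the first action)."""
    return max(ACTIONS, key=lambda a: q[a])


def train(steps=50_000, alpha=0.02, gamma=0.5, epsilon=0.1, seed=0):
    """Run two independent epsilon-greedy Q-learners on the single-state game."""
    rng = random.Random(seed)
    # One private Q-table per agent, indexed by that agent's own action only.
    q_tables = [{a: 0.0 for a in ACTIONS} for _ in range(2)]
    for _ in range(steps):
        # Each agent picks its action without looking at the other agent.
        joint = tuple(
            rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q)
            for q in q_tables
        )
        rewards = PAYOFFS[joint]
        for q, action, reward in zip(q_tables, joint, rewards):
            # Standard Q-learning update; the game has one state, so the
            # bootstrap term is the agent's own current best Q-value.
            target = reward + gamma * max(q.values())
            q[action] += alpha * (target - q[action])
    return q_tables


if __name__ == "__main__":
    for i, q in enumerate(train()):
        print(f"agent {i}: greedy action = {greedy(q)}, Q = {q}")
```

Under the assumed payoffs, D strictly dominates C for each agent, so both greedy policies should settle on D, which matches the DD joint action described in the abstract.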

References

[1] M. Bowling. Convergence problems of general-sum multiagent reinforcement learning. Seventeenth International Conference on Machine Learning, pages 89-94, 2000.
[2] A. Greenwald and K. Hall. Correlated-Q learning. Twentieth International Conference on Machine Learning, pages 242-249, 2003.
[3] J. Hu and M. Wellman. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research, pages 1039-1069, 2003.
[4] M. Littman. Friend-or-foe Q-learning in general-sum games. Eighteenth International Conference on Machine Learning, pages 322-328, 2001.

Cited By

  • (2010) An Algorithm for Multi-robot Planning. Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02, pp. 141-148. DOI: 10.1109/WI-IAT.2010.128. Online publication date: 31-Aug-2010
  • (2006) Partial local friendq multiagent learning. Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence, pp. 359-370. DOI: 10.1007/11766247_31. Online publication date: 7-Jun-2006

Published In

AAMAS '05: Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
July 2005
1407 pages
ISBN:1595930930
DOI:10.1145/1082473

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

AAMAS05
Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
