DOI: 10.5555/1838206.1838250

Frequency adjusted multi-agent Q-learning

Published: 10 May 2010

Abstract

Multi-agent learning is a crucial method to control or find solutions for systems in which more than one entity needs to be adaptive. In today's interconnected world, such systems are ubiquitous in many domains, including auctions in economics, swarm robotics in computer science, and politics in the social sciences. Multi-agent learning is inherently more complex than single-agent learning and has a relatively thin theoretical framework supporting it. Recently, multi-agent learning dynamics have been linked to evolutionary game theory, allowing learning to be interpreted as an evolution of competing policies in the minds of the learning agents. The dynamical system from evolutionary game theory that has been linked to Q-learning predicts the expected behavior of the learning agents. Closer analysis, however, allows for two interesting observations: the predicted behavior is not always the same as the actual behavior, and in cases of deviation, the predicted behavior is the more desirable one. This discrepancy is elucidated in this article, and based on these new insights Frequency Adjusted Q-learning (FAQ-learning) is proposed. This variation of Q-learning perfectly adheres to the predictions of the evolutionary model for an arbitrarily large part of the policy space. In addition to the theoretical discussion, experiments in the three classes of two-agent two-action games illustrate the superiority of FAQ-learning.
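To make the frequency adjustment concrete, the sketch below illustrates the update described in the abstract for a stateless two-action game with Boltzmann (softmax) exploration: the ordinary Q-learning step for the chosen action i is scaled by min(beta / x_i, 1), where x_i is the probability the current policy assigns to that action and beta in (0, 1] bounds the adjustment. This is a minimal illustration in Python under those assumptions; the class and parameter names (FAQLearner, alpha, beta, gamma, temperature) and the example payoff matrix are hypothetical and not the authors' reference implementation.

import numpy as np

# Illustrative sketch (not the authors' code) of Frequency Adjusted Q-learning
# for a stateless two-action game with Boltzmann exploration.
class FAQLearner:
    def __init__(self, n_actions=2, alpha=0.01, beta=0.1, gamma=0.0, temperature=0.1):
        self.q = np.zeros(n_actions)    # one Q-value per action (stateless game)
        self.alpha = alpha              # learning rate
        self.beta = beta                # frequency-adjustment bound, 0 < beta <= 1
        self.gamma = gamma              # discount factor (0 for a repeated one-shot game)
        self.temperature = temperature  # Boltzmann exploration temperature

    def policy(self):
        # Softmax (Boltzmann) distribution over the current Q-values.
        prefs = self.q / self.temperature
        prefs -= prefs.max()            # subtract the maximum for numerical stability
        x = np.exp(prefs)
        return x / x.sum()

    def select_action(self, rng):
        # Sample an action from the current mixed policy.
        return rng.choice(len(self.q), p=self.policy())

    def update(self, action, reward):
        x = self.policy()[action]
        # Frequency adjustment: the Q-learning step is scaled by min(beta / x, 1),
        # so rarely selected actions receive proportionally larger effective steps.
        step = min(self.beta / x, 1.0) * self.alpha
        target = reward + self.gamma * self.q.max()
        self.q[action] += step * (target - self.q[action])

# Hypothetical usage: two FAQ learners in a repeated 2x2 game. A holds the row
# player's payoffs (Prisoners' Dilemma-like); the column player receives A[a2, a1].
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
rng = np.random.default_rng(0)
p1, p2 = FAQLearner(), FAQLearner()
for _ in range(10_000):
    a1, a2 = p1.select_action(rng), p2.select_action(rng)
    p1.update(a1, A[a1, a2])
    p2.update(a2, A[a2, a1])

The intent of the scaling is that infrequently chosen actions are updated with a larger effective step, so the expected change of the Q-values follows the prediction of the evolutionary model wherever all action probabilities exceed beta.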





    Published In

    AAMAS '10: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1
    May 2010
    1578 pages
    ISBN: 9780982657119
    Sponsor: IFAAMAS
    Publisher: International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC


    Author Tags

    1. Q-learning
    2. evolutionary game theory
    3. multi-agent learning
    4. replicator dynamics

    Qualifiers

    • Research-article

    Conference

    AAMAS '10
    Sponsor: IFAAMAS

    Acceptance Rates

    Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

