DOI: 10.1145/1160633.1160816
Article

Exact solutions of interactive POMDPs using behavioral equivalence

Published: 08 May 2006

ABSTRACT

We present a method for transforming the infinite interactive state space of interactive POMDPs (I-POMDPs) into a finite one, thereby enabling the computation of exact solutions. I-POMDPs enable sequential decision making in multi-agent environments by modeling other agents' beliefs, capabilities, and preferences as part of the interactive state space. Because these beliefs may be arbitrarily nested and are continuous, optimal solutions cannot be computed with value iteration as in POMDPs. Our method transforms the original state space into a finite one by grouping the other agents' behaviorally equivalent models into equivalence classes. This enables us to compute the complete optimal solution of the I-POMDP, which may be represented as a policy graph. We illustrate the method on the multi-agent Tiger problem and discuss features of the solution.
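To make the core idea concrete, below is a minimal Python sketch (not the authors' code) of the grouping step: two models of the other agent are behaviorally equivalent exactly when they induce the same optimal policy, so a continuum of models partitions into finitely many equivalence classes, each usable as a single interactive state. The `Model` class, the `tiger_policy` threshold rule, and all other names here are illustrative assumptions, not an API from the paper.

```python
# Illustrative sketch of behavioral equivalence: models inducing the same
# policy collapse into one equivalence class. All names are hypothetical.

from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

@dataclass(frozen=True)
class Model:
    """A candidate model of the other agent: a (discretized) belief over
    the physical state, plus a frame identifier."""
    belief: tuple
    frame: str = "default"

def behavioral_equivalence_classes(
    models: Iterable[Model],
    solve_policy: Callable[[Model], Hashable],
) -> dict[Hashable, list[Model]]:
    """Group models by the policy they prescribe.

    `solve_policy` maps a model to a canonical, hashable description of its
    optimal policy (e.g., a serialized policy tree). Models mapping to the
    same policy are behaviorally equivalent and can be merged.
    """
    classes: dict[Hashable, list[Model]] = defaultdict(list)
    for m in models:
        classes[solve_policy(m)].append(m)
    return dict(classes)

# Toy stand-in policy for the tiger problem: listen while uncertain, open a
# door once the belief that the tiger is behind the left door is extreme.
# The 0.1/0.9 thresholds are made up for illustration.
def tiger_policy(m: Model) -> str:
    p_tiger_left = m.belief[0]
    if p_tiger_left < 0.1:
        return "open-left"      # tiger almost surely on the right
    if p_tiger_left > 0.9:
        return "open-right"     # tiger almost surely on the left
    return "listen"

# 21 discretized belief points collapse into just 3 behavioral classes.
models = [Model((p / 20, 1 - p / 20)) for p in range(21)]
for policy, members in behavioral_equivalence_classes(models, tiger_policy).items():
    print(f"{policy}: {len(members)} models")
```

The sketch shows only the partitioning step; in the paper, value iteration is then run over the resulting finite set of equivalence classes to obtain the exact solution.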


Published in

AAMAS '06: Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
May 2006
1631 pages
ISBN: 1595933034
DOI: 10.1145/1160633

    Copyright © 2006 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 8 May 2006


Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions, 23%
