skip to main content
research-article

Partial-Observation Stochastic Games: How to Win when Belief Fails

Published:02 May 2014Publication History
Skip Abstract Section

Abstract

In two-player finite-state stochastic games of partial observation on graphs, in every state of the graph, the players simultaneously choose an action, and their joint actions determine a probability distribution over the successor states. The game is played for infinitely many rounds and thus the players construct an infinite path in the graph. We consider reachability objectives where the first player tries to ensure a target state to be visited almost-surely (i.e., with probability 1) or positively (i.e., with positive probability), no matter the strategy of the second player.

We classify such games according to the information and to the power of randomization available to the players. On the basis of information, the game can be one-sided with either (a) player 1, or (b) player 2 having partial observation (and the other player has perfect observation), or two-sided with (c) both players having partial observation. On the basis of randomization, (a) the players may not be allowed to use randomization (pure strategies), or (b) they may choose a probability distribution over actions but the actual random choice is external and not visible to the player (actions invisible), or (c) they may use full randomization.

Our main results for pure strategies are as follows: (1) For one-sided games with player 2 having perfect observation we show that (in contrast to full randomized strategies) belief-based (subset-construction based) strategies are not sufficient, and we present an exponential upper bound on memory both for almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete and present symbolic algorithms that avoid the explicit exponential construction. (2) For one-sided games with player 1 having perfect observation we show that nonelementary memory is both necessary and sufficient for both almost-sure and positive winning strategies. (3) We show that for the general (two-sided) case finite-memory strategies are sufficient for both positive and almost-sure winning, and at least nonelementary memory is required. We establish the equivalence of the almost-sure winning problems for pure strategies and for randomized strategies with actions invisible. Our equivalence result exhibit serious flaws in previous results of the literature: we show a nonelementary memory lower bound for almost-sure winning whereas an exponential upper bound was previously claimed.

References

  1. M. Abadi, L. Lamport, and P. Wolper. 1989. Realizable and unrealizable specifications of reactive systems. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 372, Springer, 1--17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Alur, T. A. Henzinger, and O. Kupferman. 2002. Alternating-time temporal logic. J. ACM 49, 672--713. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Aumann, 1995. Repeated Games with Incomplete Information. MIT Press.Google ScholarGoogle Scholar
  4. C. Baier, N. Bertrand, and M. Grösser. 2008. On decision problems for probabilistic Büchi automata. In Proceedings of the International Conference on Foundations of Software Science and Computation Structures. Lecture Notes in Computer Science, vol. 4962, Springer, 287--301. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Baier, N. Bertrand, and M. Grösser. 2009. The effect of tossing coins in omega-automata. In Proceedings of the International Conference on Concurrency Theory. Lecture Notes in Computer Science, vol. 5710, Springer, 15--29. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Baier and M. Grösser. 2005. Recognizing omega-regular languages with probabilistic automata. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 137--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. N. Bertrand, B. Genest, and H. Gimbert. 2009. Qualitative determinacy and decidability of stochastic games with signals. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 319--328. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. Berwanger and L. Doyen. 2008. On the power of imperfect information. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. Dagstuhl Seminar Proceedings 08004. IBFI.Google ScholarGoogle Scholar
  9. T. Brázdil, P. Jancar, and A. Kucera. 2010. Reachability games on extended vector addition systems with states. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6199, Springer, 478--489. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. R. G. Bukharaev. 1980. Probabilistic automata. J. Math.l Sci. 13, 359--386.Google ScholarGoogle Scholar
  11. P. Cerný, K. Chatterjee, T. A. Henzinger, A. Radhakrishna, and R. Singh. 2011. Quantitative synthesis for concurrent programs. In Proceedings of the International Conference on Computer Aided Verification. Lecture Notes in Computer Science, vol. 6806, Springer, 243--259. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. R. Chadha, A. P. Sistla, and M. Viswanathan. 2009a. On the expressiveness and complexity of randomization in finite state monitors. J. ACM 56, 1--44. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. R. Chadha, A. P. Sistla, and M. Viswanathan. 2009b. Power of randomization in automata on infinite strings. In Proceedings of the International Conference on Concurrency Theory. Lecture Notes in Computer Science, vol. 5710, Springer, 229--243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. R. Chadha, A. P. Sistla, and M. Viswanathan. 2010. Model checking concurrent programs with nondeterminism and randomization. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 8, 364--375.Google ScholarGoogle Scholar
  15. K. Chatterjee, M. Chmelik, and M. Tracol. 2013a. What is decidable about partially observable Markov decision processes with omega-regular objectives. In Proceedings of the International Workshop on Computer Science Logic.Google ScholarGoogle Scholar
  16. K. Chatterjee and L. Doyen. 2010. The complexity of partial-observation parity games. In Proceedings of theInternational Conference on Logic Programming, Artificial Intelligence and Reasoning. Lecture Notes in Computer Science, vol. 6397, Springer, 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. K. Chatterjee, L. Doyen, H. Gimbert, and T. A. Henzinger. 2010a. Randomness for free. In Proceedings of the International Symposium on. Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 6281, Springer, 246--257. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. K. Chatterjee, L. Doyen, and T. A. Henzinger. 2010b. Qualitative analysis of partially-observable Markov decision processes. In Proceedings of the International Symposium on. Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 6281, Springer, 258--269. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. K. Chatterjee, L. Doyen, and T. A. Henzinger. 2013b. A survey of partial-observation stochastic parity games. Formal Methods Syst. Design 43, 2, 268--284. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. K. Chatterjee, L. Doyen, T. A. Henzinger, and J.-F. Raskin. 2007. Algorithms for omega-regular games of incomplete information. Logical Methods Computer Sci. 3, 3:4.Google ScholarGoogle Scholar
  21. K. Chatterjee, L. Doyen, S. Nain, and M. Y. Vardi. 2013c. The complexity of partial-observation stochastic parity games with finite-memory strategies. Tech. rep., IST Austria. IST-2013-141.Google ScholarGoogle Scholar
  22. K. Chatterjee and M. Tracol. 2012. Decidable problems for probabilistic automata on infinite words. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 185--194. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Condon. 1992. The complexity of stochastic games. Inform. Comput. 96, 2, 203--224. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. L. de Alfaro and T. A. Henzinger. 2001. Interface automata. In Proceedings of the ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM Press, 109--120. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. L. de Alfaro, T. A. Henzinger, and O. Kupferman. 2007. Concurrent reachability games. Theor. Comput. Sci. 386, 3, 188--217. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. M. De Wulf, L. Doyen, and J.-F. Raskin. 2006. A lattice theory for solving games of imperfect information. In Proceedings of the International Conference on Hybrid Systems: Computation and Control. Lecture Notes in Computer Science, vol. 3927, Springer, 153--168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. D. L. Dill. 1989. Trace Theory for Automatic Hierarchical Verification of Speed-independent Circuits. MIT Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. R. Dimitrova and B. Finkbeiner 2008. Abstraction refinement for games with incomplete information. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 2, 175--186.Google ScholarGoogle Scholar
  29. L. Doyen and J.-F. Raskin. 2010. Antichains algorithms for finite automata. In Proceedings of the Workshop on Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, vol. 6015, Springer, 2--22. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. E. A. Emerson and C. Jutla. 1991. Tree automata, mu-calculus and determinacy. In Proceedings of the Annual Symposium on Foundations of Computer Science. 368--377. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. K. Etessami and M. Yannakakis. 2006. Recursive concurrent stochastic games. In Proceedings of the International Colloquium on Automata, Languages and Programming. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. N. Fijalkow, H. Gimbert, and Y. Oualhadj. 2012. Deciding the value 1 problem for probabilistic leaktight automata. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 295--304. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. A. Finkel and J. Goubault-Larrecq. 2012. Forward analysis for WSTS, part II: Complete WSTS. Logical Methods Comput. Sci. 8, 3.Google ScholarGoogle Scholar
  34. H. Gimbert and Y. Oualhadj. 2010. Probabilistic automata on finite words: Decidable and undecidable problems. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6199, Springer, 527--538. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. V. Gripon and O. Serre. 2009. Qualitative concurrent stochastic games with imperfect information. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 5556, Springer, 200--211. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. T. A. Henzinger and P. Kopke. 1999. Discrete-time control for rectangular hybrid automata. Theor. Comp. Sci. 221, 369--392. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. W. Jamroga, S. Mauw, and M. Melissen. 2012. Fairness in non-repudiation protocols. In Security and Trust Management. Lecture Notes in Computer Science, vol. 7170, Springer, 122--139. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. A. Kechris. 1995. Classical Descriptive Set Theory. Springer.Google ScholarGoogle Scholar
  39. D. König. 1936. Theorie der endlichen und unendlichen Graphen. Akademische Verlagsgesellschaft, Leipzig.Google ScholarGoogle Scholar
  40. H. Kress-Gazit, G. E. Fainekos, and G. J. Pappas. 2009. Temporal-logic-based reactive mission and motion planning. IEEE Trans. Rob. 25, 6, 1370--1381. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. O. Kupferman and M. Y. Vardi. 2000. Synthesis with incomplete informatio. In Advances in Temporal Logic, Kluwer Academic Publishers, 109--127.Google ScholarGoogle Scholar
  42. S. Nain and M. Y. Vardi 2013. Solving partial-information stochastic parity games. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 341--348. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. C. H. Papadimitriou and J. N. Tsitsiklis. 1987. The complexity of Markov decision processes. Math. Oper. Rese. 12, 441--450. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. A. Paz. 1971. Introduction to Probabilistic Automata. Academic Press, Inc. Orlando, FL. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. A. Pnueli and R. Rosner. 1989. On the synthesis of a reactive module. In Proceedings of the ACM Symposium on Principles of Programming Languages. ACM Press, 179--190. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. M. O. Rabin. 1963. Probabilistic automata. Inform. Control 6, 230--245.Google ScholarGoogle ScholarCross RefCross Ref
  47. C. Rackoff. 1978. The covering and boundedness problems for vector addition systems. Theor. Comput. Sci. 6, 223--231.Google ScholarGoogle ScholarCross RefCross Ref
  48. P. J. Ramadge and W. M. Wonham. 1987. Supervisory control of a class of discrete-event processes. SIAM J. Control Optim. 25, 1, 206--230. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. J. H. Reif. 1979. Universal games of incomplete information. In Proceedings of the Annual ACM Symposium on Theory of Computing. ACM, 288--308. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. J. H. Reif. 1984. The complexity of two-player games of incomplete information. J. Comput. Syst. Sci. 29, 274--301.Google ScholarGoogle ScholarCross RefCross Ref
  51. J. H. Reif and G. L. Peterson. 1980. A dynamic logic of multiprocessing with incomplete information. In Proceedings of the ACM Symposium on Principles of Programming Languages. ACM, 193--202. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. J. Renault. 2012. The value of repeated games with an informed controller. Math. Oper. Res. 37, 1, 154--179. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. D. Rosenberg, E. Solan, and N. Vieille. 2003. The maxmin value of stochastic games with imperfect monitoring. Int. J. Game Theory 32, 1, 133--150.Google ScholarGoogle ScholarCross RefCross Ref
  54. L. E. Rosier and H.-C. Yen, 1986. A multiparameter analysis of the boundedness problem for vector addition systems. J. Comput. Syst. Sci. 32, 1, 105--135. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. L. S. Shapley. 1953. Stochastic games. Proc. Nat. Acad. Sci. U.S.A. 39, 1095--1100.Google ScholarGoogle ScholarCross RefCross Ref
  56. S. Sorin. 2002. A First Course in Zero-Sum Repeated Games. Springer.Google ScholarGoogle Scholar
  57. W. Thomas. 1997. Languages, automata, and logic. In Handbook of Formal Languages, Vol. 3, Beyond Words. Springer, Chapter 7, 389--455. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. M. Tracol, C. Baier, and M. Grösser. 2009. Recurrence and transience for probabilistic automata. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 4. 395--406.Google ScholarGoogle Scholar
  59. M. Y. Vardi. 1985. Automatic verification of probabilistic concurrent finite-state systems. In Proceedings of the Annual Symposium on Foundations of Computer Science. 327--338. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Partial-Observation Stochastic Games: How to Win when Belief Fails

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Computational Logic
        ACM Transactions on Computational Logic  Volume 15, Issue 2
        April 2014
        257 pages
        ISSN:1529-3785
        EISSN:1557-945X
        DOI:10.1145/2616911
        Issue’s Table of Contents

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 2 May 2014
        • Accepted: 1 November 2013
        • Received: 1 July 2013
        Published in tocl Volume 15, Issue 2

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader