Abstract
In two-player finite-state stochastic games of partial observation on graphs, in every state of the graph, the players simultaneously choose an action, and their joint actions determine a probability distribution over the successor states. The game is played for infinitely many rounds and thus the players construct an infinite path in the graph. We consider reachability objectives where the first player tries to ensure a target state to be visited almost-surely (i.e., with probability 1) or positively (i.e., with positive probability), no matter the strategy of the second player.
We classify such games according to the information and to the power of randomization available to the players. On the basis of information, the game can be one-sided with either (a) player 1, or (b) player 2 having partial observation (and the other player has perfect observation), or two-sided with (c) both players having partial observation. On the basis of randomization, (a) the players may not be allowed to use randomization (pure strategies), or (b) they may choose a probability distribution over actions but the actual random choice is external and not visible to the player (actions invisible), or (c) they may use full randomization.
Our main results for pure strategies are as follows: (1) For one-sided games with player 2 having perfect observation we show that (in contrast to full randomized strategies) belief-based (subset-construction based) strategies are not sufficient, and we present an exponential upper bound on memory both for almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete and present symbolic algorithms that avoid the explicit exponential construction. (2) For one-sided games with player 1 having perfect observation we show that nonelementary memory is both necessary and sufficient for both almost-sure and positive winning strategies. (3) We show that for the general (two-sided) case finite-memory strategies are sufficient for both positive and almost-sure winning, and at least nonelementary memory is required. We establish the equivalence of the almost-sure winning problems for pure strategies and for randomized strategies with actions invisible. Our equivalence result exhibit serious flaws in previous results of the literature: we show a nonelementary memory lower bound for almost-sure winning whereas an exponential upper bound was previously claimed.
- M. Abadi, L. Lamport, and P. Wolper. 1989. Realizable and unrealizable specifications of reactive systems. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 372, Springer, 1--17. Google ScholarDigital Library
- R. Alur, T. A. Henzinger, and O. Kupferman. 2002. Alternating-time temporal logic. J. ACM 49, 672--713. Google ScholarDigital Library
- R. Aumann, 1995. Repeated Games with Incomplete Information. MIT Press.Google Scholar
- C. Baier, N. Bertrand, and M. Grösser. 2008. On decision problems for probabilistic Büchi automata. In Proceedings of the International Conference on Foundations of Software Science and Computation Structures. Lecture Notes in Computer Science, vol. 4962, Springer, 287--301. Google ScholarDigital Library
- C. Baier, N. Bertrand, and M. Grösser. 2009. The effect of tossing coins in omega-automata. In Proceedings of the International Conference on Concurrency Theory. Lecture Notes in Computer Science, vol. 5710, Springer, 15--29. Google ScholarDigital Library
- C. Baier and M. Grösser. 2005. Recognizing omega-regular languages with probabilistic automata. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 137--146. Google ScholarDigital Library
- N. Bertrand, B. Genest, and H. Gimbert. 2009. Qualitative determinacy and decidability of stochastic games with signals. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 319--328. Google ScholarDigital Library
- D. Berwanger and L. Doyen. 2008. On the power of imperfect information. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. Dagstuhl Seminar Proceedings 08004. IBFI.Google Scholar
- T. Brázdil, P. Jancar, and A. Kucera. 2010. Reachability games on extended vector addition systems with states. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6199, Springer, 478--489. Google ScholarDigital Library
- R. G. Bukharaev. 1980. Probabilistic automata. J. Math.l Sci. 13, 359--386.Google Scholar
- P. Cerný, K. Chatterjee, T. A. Henzinger, A. Radhakrishna, and R. Singh. 2011. Quantitative synthesis for concurrent programs. In Proceedings of the International Conference on Computer Aided Verification. Lecture Notes in Computer Science, vol. 6806, Springer, 243--259. Google ScholarDigital Library
- R. Chadha, A. P. Sistla, and M. Viswanathan. 2009a. On the expressiveness and complexity of randomization in finite state monitors. J. ACM 56, 1--44. Google ScholarDigital Library
- R. Chadha, A. P. Sistla, and M. Viswanathan. 2009b. Power of randomization in automata on infinite strings. In Proceedings of the International Conference on Concurrency Theory. Lecture Notes in Computer Science, vol. 5710, Springer, 229--243. Google ScholarDigital Library
- R. Chadha, A. P. Sistla, and M. Viswanathan. 2010. Model checking concurrent programs with nondeterminism and randomization. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 8, 364--375.Google Scholar
- K. Chatterjee, M. Chmelik, and M. Tracol. 2013a. What is decidable about partially observable Markov decision processes with omega-regular objectives. In Proceedings of the International Workshop on Computer Science Logic.Google Scholar
- K. Chatterjee and L. Doyen. 2010. The complexity of partial-observation parity games. In Proceedings of theInternational Conference on Logic Programming, Artificial Intelligence and Reasoning. Lecture Notes in Computer Science, vol. 6397, Springer, 1--14. Google ScholarDigital Library
- K. Chatterjee, L. Doyen, H. Gimbert, and T. A. Henzinger. 2010a. Randomness for free. In Proceedings of the International Symposium on. Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 6281, Springer, 246--257. Google ScholarDigital Library
- K. Chatterjee, L. Doyen, and T. A. Henzinger. 2010b. Qualitative analysis of partially-observable Markov decision processes. In Proceedings of the International Symposium on. Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 6281, Springer, 258--269. Google ScholarDigital Library
- K. Chatterjee, L. Doyen, and T. A. Henzinger. 2013b. A survey of partial-observation stochastic parity games. Formal Methods Syst. Design 43, 2, 268--284. Google ScholarDigital Library
- K. Chatterjee, L. Doyen, T. A. Henzinger, and J.-F. Raskin. 2007. Algorithms for omega-regular games of incomplete information. Logical Methods Computer Sci. 3, 3:4.Google Scholar
- K. Chatterjee, L. Doyen, S. Nain, and M. Y. Vardi. 2013c. The complexity of partial-observation stochastic parity games with finite-memory strategies. Tech. rep., IST Austria. IST-2013-141.Google Scholar
- K. Chatterjee and M. Tracol. 2012. Decidable problems for probabilistic automata on infinite words. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 185--194. Google ScholarDigital Library
- A. Condon. 1992. The complexity of stochastic games. Inform. Comput. 96, 2, 203--224. Google ScholarDigital Library
- L. de Alfaro and T. A. Henzinger. 2001. Interface automata. In Proceedings of the ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM Press, 109--120. Google ScholarDigital Library
- L. de Alfaro, T. A. Henzinger, and O. Kupferman. 2007. Concurrent reachability games. Theor. Comput. Sci. 386, 3, 188--217. Google ScholarDigital Library
- M. De Wulf, L. Doyen, and J.-F. Raskin. 2006. A lattice theory for solving games of imperfect information. In Proceedings of the International Conference on Hybrid Systems: Computation and Control. Lecture Notes in Computer Science, vol. 3927, Springer, 153--168. Google ScholarDigital Library
- D. L. Dill. 1989. Trace Theory for Automatic Hierarchical Verification of Speed-independent Circuits. MIT Press. Google ScholarDigital Library
- R. Dimitrova and B. Finkbeiner 2008. Abstraction refinement for games with incomplete information. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 2, 175--186.Google Scholar
- L. Doyen and J.-F. Raskin. 2010. Antichains algorithms for finite automata. In Proceedings of the Workshop on Tools and Algorithms for the Construction and Analysis of Systems. Lecture Notes in Computer Science, vol. 6015, Springer, 2--22. Google ScholarDigital Library
- E. A. Emerson and C. Jutla. 1991. Tree automata, mu-calculus and determinacy. In Proceedings of the Annual Symposium on Foundations of Computer Science. 368--377. Google ScholarDigital Library
- K. Etessami and M. Yannakakis. 2006. Recursive concurrent stochastic games. In Proceedings of the International Colloquium on Automata, Languages and Programming. Springer. Google ScholarDigital Library
- N. Fijalkow, H. Gimbert, and Y. Oualhadj. 2012. Deciding the value 1 problem for probabilistic leaktight automata. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 295--304. Google ScholarDigital Library
- A. Finkel and J. Goubault-Larrecq. 2012. Forward analysis for WSTS, part II: Complete WSTS. Logical Methods Comput. Sci. 8, 3.Google Scholar
- H. Gimbert and Y. Oualhadj. 2010. Probabilistic automata on finite words: Decidable and undecidable problems. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6199, Springer, 527--538. Google ScholarDigital Library
- V. Gripon and O. Serre. 2009. Qualitative concurrent stochastic games with imperfect information. In Proceedings of the International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 5556, Springer, 200--211. Google ScholarDigital Library
- T. A. Henzinger and P. Kopke. 1999. Discrete-time control for rectangular hybrid automata. Theor. Comp. Sci. 221, 369--392. Google ScholarDigital Library
- W. Jamroga, S. Mauw, and M. Melissen. 2012. Fairness in non-repudiation protocols. In Security and Trust Management. Lecture Notes in Computer Science, vol. 7170, Springer, 122--139. Google ScholarDigital Library
- A. Kechris. 1995. Classical Descriptive Set Theory. Springer.Google Scholar
- D. König. 1936. Theorie der endlichen und unendlichen Graphen. Akademische Verlagsgesellschaft, Leipzig.Google Scholar
- H. Kress-Gazit, G. E. Fainekos, and G. J. Pappas. 2009. Temporal-logic-based reactive mission and motion planning. IEEE Trans. Rob. 25, 6, 1370--1381. Google ScholarDigital Library
- O. Kupferman and M. Y. Vardi. 2000. Synthesis with incomplete informatio. In Advances in Temporal Logic, Kluwer Academic Publishers, 109--127.Google Scholar
- S. Nain and M. Y. Vardi 2013. Solving partial-information stochastic parity games. In Proceedings of the Annual IEEE Symposium on Logic in Computer Science. 341--348. Google ScholarDigital Library
- C. H. Papadimitriou and J. N. Tsitsiklis. 1987. The complexity of Markov decision processes. Math. Oper. Rese. 12, 441--450. Google ScholarDigital Library
- A. Paz. 1971. Introduction to Probabilistic Automata. Academic Press, Inc. Orlando, FL. Google ScholarDigital Library
- A. Pnueli and R. Rosner. 1989. On the synthesis of a reactive module. In Proceedings of the ACM Symposium on Principles of Programming Languages. ACM Press, 179--190. Google ScholarDigital Library
- M. O. Rabin. 1963. Probabilistic automata. Inform. Control 6, 230--245.Google ScholarCross Ref
- C. Rackoff. 1978. The covering and boundedness problems for vector addition systems. Theor. Comput. Sci. 6, 223--231.Google ScholarCross Ref
- P. J. Ramadge and W. M. Wonham. 1987. Supervisory control of a class of discrete-event processes. SIAM J. Control Optim. 25, 1, 206--230. Google ScholarDigital Library
- J. H. Reif. 1979. Universal games of incomplete information. In Proceedings of the Annual ACM Symposium on Theory of Computing. ACM, 288--308. Google ScholarDigital Library
- J. H. Reif. 1984. The complexity of two-player games of incomplete information. J. Comput. Syst. Sci. 29, 274--301.Google ScholarCross Ref
- J. H. Reif and G. L. Peterson. 1980. A dynamic logic of multiprocessing with incomplete information. In Proceedings of the ACM Symposium on Principles of Programming Languages. ACM, 193--202. Google ScholarDigital Library
- J. Renault. 2012. The value of repeated games with an informed controller. Math. Oper. Res. 37, 1, 154--179. Google ScholarDigital Library
- D. Rosenberg, E. Solan, and N. Vieille. 2003. The maxmin value of stochastic games with imperfect monitoring. Int. J. Game Theory 32, 1, 133--150.Google ScholarCross Ref
- L. E. Rosier and H.-C. Yen, 1986. A multiparameter analysis of the boundedness problem for vector addition systems. J. Comput. Syst. Sci. 32, 1, 105--135. Google ScholarDigital Library
- L. S. Shapley. 1953. Stochastic games. Proc. Nat. Acad. Sci. U.S.A. 39, 1095--1100.Google ScholarCross Ref
- S. Sorin. 2002. A First Course in Zero-Sum Repeated Games. Springer.Google Scholar
- W. Thomas. 1997. Languages, automata, and logic. In Handbook of Formal Languages, Vol. 3, Beyond Words. Springer, Chapter 7, 389--455. Google ScholarDigital Library
- M. Tracol, C. Baier, and M. Grösser. 2009. Recurrence and transience for probabilistic automata. In Proceedings of the Conference on Foundations of Software Technology and Theoretical Computer Science. LIPIcs Series, vol. 4. 395--406.Google Scholar
- M. Y. Vardi. 1985. Automatic verification of probabilistic concurrent finite-state systems. In Proceedings of the Annual Symposium on Foundations of Computer Science. 327--338. Google ScholarDigital Library
Index Terms
Partial-Observation Stochastic Games: How to Win when Belief Fails
Recommendations
Qualitative Determinacy and Decidability of Stochastic Games with Signals
We consider two-person zero-sum stochastic games with signals, a standard model of stochastic games with imperfect information. The only source of information for the players consists of the signals they receive; they cannot directly observe the state ...
Partial-Observation Stochastic Games: How to Win When Belief Fails
LICS '12: Proceedings of the 2012 27th Annual IEEE/ACM Symposium on Logic in Computer ScienceWe consider two-player stochastic games played on finite graphs with reachability objectives where the first player tries to ensure a target state to be visited almost-surely (i.e., with probability 1), or positively (i.e., with positive probability), ...
A survey of partial-observation stochastic parity games
We consider two-player zero-sum stochastic games on graphs with -regular winning conditions specified as parity objectives. These games have applications in the design and control of reactive systems. We survey the complexity results for the problem of ...
Comments