We prove two determinacy and decidability results about two-player stochastic reachability games with partial observation on both sides and finitely many states, signals and actions.
We prove that the determinacy of Gale-Stewart games whose winning sets are accepted by real-time 1-counter Büchi automata is equivalent to the determinacy of (effective) analytic Gale-Stewart games, which is known to be a large cardinal assumption. We also show that the determinacy of Wadge games between two players in charge of omega-languages accepted by 1-counter Büchi automata is equivalent to (effective) analytic Wadge determinacy. Using some results of set theory, we prove that one can effectively construct a 1-counter Büchi automaton A and a Büchi automaton B such that: (1) there exists a model of ZFC in which Player 2 has a winning strategy in the Wadge game W(L(A), L(B)); (2) there exists a model of ZFC in which the Wadge game W(L(A), L(B)) is not determined. Moreover, these are the only two possibilities, i.e., there is no model of ZFC in which Player 1 has a winning strategy in the Wadge game W(L(A), L(B)).
We study games with reachability objectives under energy constraints. We first prove that under strict energy constraints (either a lower-bound constraint only or an interval constraint), those games are LOGSPACE-equivalent to energy games with the same energy constraints but without a reachability objective (i.e., for infinite runs). We then consider two kinds of relaxations of the upper-bound constraint (while keeping the lower-bound constraint strict). In the first one, called weak upper bound, the upper bound is absorbing: more energy may be received when the upper bound is already reached, but the extra energy is not stored. In the second one, we allow temporary violations of the upper bound, imposing limits on the number or on the amount of violations. We prove that with a weak upper bound, reachability objectives require memory but can still be solved in polynomial time for one-player arenas; we prove that they are in co-NP in the two-player setting. Allowing bounded violations makes the problem PSPACE-complete for one-player arenas and EXPTIME-complete for two players.
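As a rough illustration of the weak upper bound semantics described in this abstract, the following Python sketch tracks the energy level along a finite sequence of weights. The function name, its parameters and the example weights are assumptions made for illustration and are not taken from the paper. The initial level is assumed to lie between the two bounds.

```python
from typing import Iterable, Optional


def run_weak_upper_bound(
    weights: Iterable[int], initial: int, upper: int, lower: int = 0
) -> Optional[int]:
    """Follow a finite run under a weak upper bound and a strict lower bound.

    Returns the final energy level, or None if the level ever drops
    below the lower bound (the run is then losing for the energy player).
    """
    level = initial
    for w in weights:
        # Weak upper bound: energy above `upper` is received but not stored.
        level = min(level + w, upper)
        # The lower bound stays strict: any violation ends the run.
        if level < lower:
            return None
    return level


# With upper bound 3, gaining 5 only stores 3, so losing 3 afterwards leaves 0.
print(run_weak_upper_bound([5, -3], initial=0, upper=3))
# Dropping below the lower bound is a violation.
print(run_weak_upper_bound([1, -2], initial=0, upper=3))
```

The sketch only evaluates one fixed run; it does not decide who wins a game, which is the problem whose complexity the abstract discusses.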
We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness and angelic and demonic non-determinism. A lexicographic order allows one to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic, and that memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP $\cap$ coNP, matching the currently known bound for single objectives; in general, the decision problem is PSPACE-hard and can be solved in NEXPTIME $\cap$ coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to the computation of optimal strategies in a sequence of single-objective games. We have implemented our algorithm and report experimental results on various case studies.
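To make the notion of a lexicographic preference concrete, here is a minimal Python sketch that compares vectors of satisfaction probabilities, listed in order of decreasing priority. The function name, the option labels and the numbers are hypothetical and chosen only for illustration; this is not the algorithm of the paper, which works on the game itself.

```python
from typing import Dict, Tuple


def lex_best(options: Dict[str, Tuple[float, ...]]) -> str:
    """Return the label whose value vector is lexicographically largest.

    Each vector lists the probability of satisfying objective 1, 2, ...
    in priority order. Python compares tuples lexicographically, so a
    later objective only matters when all earlier ones are tied.
    """
    return max(options, key=lambda label: options[label])


# Hypothetical value vectors for three candidate strategies.
candidates = {
    "a": (0.9, 0.1),
    "b": (0.9, 0.4),  # ties with "a" on objective 1, better on objective 2
    "c": (0.8, 1.0),  # best on objective 2, but worse on objective 1
}
print(lex_best(candidates))
```

Exact tuple comparison treats nearly equal floating-point values as different, so a practical implementation would likely need exact arithmetic or a tolerance when testing for ties.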
We consider an example of stochastic games with partial, asymmetric and non-classical information. We obtain relevant equilibrium policies using a new approach which allows managing the belief updates in a structured manner. Agents have access only to partial information updates, and our approach is to consider optimal open-loop control until the next information update. The agents continuously control the rates of their Poisson search clocks to acquire the locks; the agent that acquires all the locks before the others receives a reward of one. However, the agents have no information about the acquisition status of others and incur a cost proportional to their rate process. We solved the problem for the case with two agents and two locks and conjectured the results for $N$ agents. We showed that a pair of (partial) state-dependent time-threshold policies forms a Nash equilibrium.
Two-player, turn-based, stochastic games with reachability conditions are considered, where the maximizer has no information (he is blind) and is restricted to deterministic strategies, whereas the minimizer is perfectly informed. We ask whether the game has maxmin 1; in other words, we ask whether for all $\epsilon>0$ there exists a deterministic strategy for the (blind) maximizer such that, against all strategies of the minimizer, the set of final states is reached with probability larger than $1-\epsilon$. This problem is undecidable in general, but we define a class of games, called leaktight half-blind games, for which the problem becomes decidable. We also show that mixed strategies are in general stronger for both players and that optimal strategies for the minimizer might require infinite memory.
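The maxmin 1 question stated in this abstract can be written as a single formula. The symbols below are notation assumed here for illustration, not taken from the paper: $\sigma$ ranges over deterministic strategies of the blind maximizer, $\tau$ over strategies of the minimizer, and $F$ is the set of final states.

```latex
\sup_{\sigma} \, \inf_{\tau} \, \mathbb{P}^{\sigma,\tau}(\text{reach } F) = 1
\quad\Longleftrightarrow\quad
\forall \epsilon > 0 \;\; \exists \sigma \;\; \forall \tau : \;
\mathbb{P}^{\sigma,\tau}(\text{reach } F) > 1 - \epsilon
```

The left-hand side says the maxmin value equals 1; the right-hand side is the $\epsilon$-formulation used in the abstract. Neither requires that a single strategy reach $F$ with probability exactly 1.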
Nathalie Bertrand, Blaise Genest, Hugo Gimbert. (2008). "Determinacy and Decidability of Reachability Games with Partial Observation on Both Sides".