A considerable body of work in the ATL tradition uses the differences between various types of strategies (positional, uniform, perfect recall) to give alternative semantics to the same logical language. This paper contributes to another perspective on strategy types, one where we characterise the differences between them on the syntactic (object language) level. This is important for a more traditional knowledge representation view on strategic content. Leaving differences between strategy types implicit in the semantics is a sensible idea if the goal is to use the strategic formalism for model checking. But for traditional knowledge representation in terms of object language level formulas, we need to extend the language. This paper introduces a strategic STIT syntax with explicit operators for knowledge that allows us to characterise strategy types. This more expressive strategic language is interpreted on standard ATL-type concurrent epistemic game structures. We introduce rule-based strategies in our language and fruitfully apply them to the representation and characterisation of positional and uniform strategies. Our representations highlight crucial conditions to be met for strategy types. We demonstrate the usefulness of our work by showing that it leads to a critical reexamination of coalitional uniform strategies.
As a contribution to the challenge of building game-playing AI systems, we develop and analyse a formal language for representing and reasoning about strategies. Our logical language builds on the existing general Game Description Language (GDL) and extends it by a standard modality for linear time along with two dual connectives to express preferences when combining strategies. The semantics of the language is provided by a standard state-transition model. As such, problems that require reasoning about games can be solved by the standard methods for reasoning about actions and change. We also endow the language with a specific semantics by which strategy formulas are understood as move recommendations for a player. To illustrate how our formalism supports automated reasoning about strategies, we demonstrate two example methods of implementation: first, we formalise the semantic interpretation of our language in conjunction with game rules and strategy rules in the Situation Calculus; second, we show how the reasoning problem can be solved with Answer Set Programming.
In asynchronous games, Melliès proved that innocent strategies are positional: their behaviour only depends on the position, not the temporal order used to reach it. This insightful result shaped our understanding of the link between dynamic (i.e. game) and static (i.e. relational) semantics. In this paper, we investigate the positionality of innocent strategies in the traditional setting of Hyland-Ong-Nickau-Coquand pointer games. We show that though innocent strategies are not positional, total finite innocent strategies still enjoy a key consequence of positionality, namely positional injectivity: they are entirely determined by their positions. Unfortunately, this does not hold in general: we show a counterexample if finiteness and totality are lifted. For finite partial strategies we leave the problem open; we show however the partial result that two strategies with the same positions must have the same P-views of maximal length.
Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances. We study pMDPs with reachability objectives where the parameter values are unknown and impossible to measure directly during execution, but a probability distribution over the parameter values is known. We study, for the first time, the computation of parameter-independent strategies that are expectation optimal, i.e., that optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal strategies in POMDPs. We evaluate our method experimentally on several benchmarks: a motivating (repeated) learner model; a series of benchmarks of varying configurations of a robot moving on a grid; and a consensus protocol.
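The notion of an expectation-optimal, parameter-independent strategy can be illustrated on a toy example. The sketch below is not the POMDP reduction described in the abstract: it simply enumerates memoryless strategies by brute force on a small acyclic pMDP. The model, the state and action names, the prior `PRIOR`, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy pMDP: transition probabilities depend on one parameter p.
# "goal" and "sink" are absorbing; the model is acyclic by construction.
def transitions(p):
    return {
        "s0": {
            "safe": {"s1": 1.0},
            "risky": {"goal": p, "sink": 1.0 - p},
        },
        "s1": {
            "try": {"goal": 0.5, "sink": 0.5},
            "wait": {"goal": 1.0 - p, "sink": p},
        },
    }

# Assumed discrete prior over the unknown parameter: value -> probability.
PRIOR = {0.2: 0.5, 0.9: 0.5}

def reach_probability(strategy, p):
    """Probability of reaching 'goal' from 's0' for a fixed parameter value p."""
    trans = transitions(p)

    def value(state):
        if state == "goal":
            return 1.0
        if state == "sink":
            return 0.0
        dist = trans[state][strategy[state]]
        return sum(prob * value(nxt) for nxt, prob in dist.items())

    return value("s0")

def expected_reach(strategy):
    """Reachability probability averaged over the prior on p."""
    return sum(w * reach_probability(strategy, p) for p, w in PRIOR.items())

def best_memoryless_strategy():
    """Brute-force search over all memoryless strategies of the toy model."""
    states = ["s0", "s1"]
    structure = transitions(0.5)  # only the action names are used here
    choices = product(*(structure[s] for s in states))
    candidates = [dict(zip(states, choice)) for choice in choices]
    return max(candidates, key=expected_reach)
```

In this toy model the strategy is fixed before the parameter is known, so it is scored by its average performance over the prior rather than by its performance for any single value of p. Restricting the search to memoryless strategies is a simplification made for this sketch; the reduction to POMDPs in the abstract addresses the general problem.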
Concurrent strategies based on event structures are examined from the viewpoint of may and must testing in traditional process calculi. In their pure form concurrent strategies fail to expose the deadlocks and divergences that can arise in their composition. This motivates an extension of the bicategory of concurrent strategies to treat the may and must behaviour of strategies under testing. One extension adjoins neutral moves to strategies but in so doing loses identities w.r.t. composition. This in turn motivates another extension in which concurrent strategies are accompanied by stopping configurations; the ensuing stopping strategies inherit the structure of a bicategory from that of strategies. The technical developments converge in providing characterisations of the may and must equivalences and preorders on strategies.
We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement, namely strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corresponding problem. The bounded memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
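The achievability question for pure stationary strategies can be made concrete on a small example. The following sketch is a naive enumeration, not the MILP encoding mentioned in the abstract, and it scales exponentially in the number of states. The two-objective MDP `MDP`, its state and action names, and the helper functions are hypothetical.

```python
from itertools import product

# Hypothetical acyclic MDP with two reward objectives.
# MDP[state][action] = (reward vector, successor distribution); "end" is terminal.
MDP = {
    "s0": {
        "a": ((1.0, 0.0), {"s1": 1.0}),
        "b": ((0.0, 1.0), {"s1": 0.5, "end": 0.5}),
    },
    "s1": {
        "c": ((2.0, 0.0), {"end": 1.0}),
        "d": ((0.0, 2.0), {"end": 1.0}),
    },
}

def expected_rewards(strategy, state="s0"):
    """Expected total reward vector from `state` under a pure stationary strategy."""
    if state == "end":
        return (0.0, 0.0)
    reward, dist = MDP[state][strategy[state]]
    total = list(reward)
    for nxt, prob in dist.items():
        future = expected_rewards(strategy, nxt)
        for i, v in enumerate(future):
            total[i] += prob * v
    return tuple(total)

def pure_stationary_strategies():
    """Yield every mapping from states to a single action."""
    states = list(MDP)
    for choice in product(*(MDP[s] for s in states)):
        yield dict(zip(states, choice))

def achievable(point):
    """True if some pure stationary strategy meets every threshold in `point`."""
    return any(
        all(v >= t for v, t in zip(expected_rewards(s), point))
        for s in pure_stationary_strategies()
    )
```

In this toy model the four pure stationary strategies yield the reward vectors (3, 0), (1, 2), (1, 1) and (0, 2), so the point (1, 2) is achievable while (2, 1) is not. A strategy that plays `a` and then randomizes equally between `c` and `d` would reach (2, 1), which shows how restricting to pure strategies changes the set of achievable points.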