New community

Subscribe to the gold package and get unlimited access to Shamra Academy

The Complexity of Synchronizing Markov Decision Processes

204 0 0.0 ( 0 )

Download Cite

Added by Mahsa Shirmohammadi

Publication date 2016

fields Informatics Engineering

and research's language is English

Authors Laurent Doyen - Thierry Massart - Mahsa Shirmohammadi

Formal Languages and Automata Theory Logic in Computer Science

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We consider Markov decision processes (MDP) as generators of sequences of probability distributions over states. A probability distribution is p-synchronizing if the probability mass is at least p in a single state, or in a given set of states. We consider four temporal synchronizing modes: a sequence of probability distributions is always p-synchronizing, eventually p-synchronizing, weakly p-synchronizing, or strongly p-synchronizing if, respectively, all, some, infinitely many, or all but finitely many distributions in the sequence are p-synchronizing. For each synchronizing mode, an MDP can be (i) sure winning if there is a strategy that produces a 1-synchronizing sequence; (ii) almost-sure winning if there is a strategy that produces a sequence that is, for all epsilon > 0, a (1-epsilon)-synchronizing sequence; (iii) limit-sure winning if for all epsilon > 0, there is a strategy that produces a (1-epsilon)-synchronizing sequence. We provide fundamental results on the expressiveness, decidability, and complexity of synchronizing properties for MDPs. For each synchronizing mode, we consider the problem of deciding whether an MDP is sure, almost-sure, or limit-sure winning, and we establish matching upper and lower complexity bounds of the problems: for all winning modes, we show that the problems are PSPACE-complete for eventually and weakly synchronizing, and PTIME-complete for always and strongly synchronizing. We establish the memory requirement for winning strategies, and we show that all winning modes coincide for always synchronizing, and that the almost-sure and limit-sure winning modes coincide for weakly and strongly synchronizing.

rate research

Synchronizing Objectives for Markov Decision Processes

133 - Laurent Doyen 2011

We introduce synchronizing objectives for Markov decision processes (MDP). Intuitively, a synchronizing objective requires that eventually, at every step there is a state which concentrates almost all the probability mass. In particular, it implies that the probabilistic system behaves in the long run like a deterministic system: eventually, the current state of the MDP can be identified with almost certainty. We study the problem of deciding the existence of a strategy to enforce a synchronizing objective in MDPs. We show that the problem is decidable for general strategies, as well as for blind strategies where the player cannot observe the current state of the MDP. We also show that pure strategies are sufficient, but memory may be necessary.

Logic in Computer Science Computational Complexity

A Faster-Than Relation for Semi-Markov Decision Processes

60 - Mathias Ruggaard Pedersen 2018

When modeling concurrent or cyber-physical systems, non-functional requirements such as time are important to consider. In order to improve the timing aspects of a model, it is necessary to have some notion of what it means for a process to be faster than another, which can guide the stepwise refinement of the model. To this end we study a faster-than relation for semi-Markov decision processes and compare it to standard notions for relating systems. We consider the compositional aspects of this relation, and show that the faster-than relation is not a precongruence with respect to parallel composition, hence giving rise to so-called parallel timing anomalies. We take the first steps toward understanding this problem by identifying decidable conditions sufficient to avoid parallel timing anomalies in the absence of non-determinism.

Formal Languages and Automata Theory

The Complexity of Reachability in Parametric Markov Decision Processes

240 - Sebastian Junges , Joost-Pieter Katoen , Guillermo A. Perez 2020

This article presents the complexity of reachability decision problems for parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. In particular, we study the complexity of finding values for these parameters such that the induced MDP satisfies some maximal or minimal reachability probability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem.

Logic in Computer Science

Model-free Reinforcement Learning for Branching Markov Decision Processes

127 - Ernst Moritz Hahn , Mateo Perez , Sven Schewe 2021

We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMCs is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of a each entity of the same type follows the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.

Machine Learning Logic in Computer Science Systems and Control

On the Complexity of Reachability in Parametric Markov Decision Processes

308 - Tobias Winkler , Sebastian Junges , Guillermo A. Perez 2019

This paper studies parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. Fixing values for all parameters yields MDPs. In particular, this paper studies the complexity of finding values for these parameters such that the induced MDP satisfies some reachability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem. Furthermore, we provide insights in the functions describing the induced reachability probabilities, and how pMDPs generalise concurrent stochastic reachability games.

Logic in Computer Science Computational Complexity

comments

Fetching comments

International University for Science and Technology

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

The Complexity of Synchronizing Markov Decision Processes

Ask ChatGPT about the research

No Arabic abstract

Read More