بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

The Complexity of Synchronizing Markov Decision Processes

204 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mahsa Shirmohammadi

تاريخ النشر 2016

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Laurent Doyen - Thierry Massart - Mahsa Shirmohammadi

اللغات الرسمية ونظرية الأتومات المنطق في علوم الحاسوب

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider Markov decision processes (MDP) as generators of sequences of probability distributions over states. A probability distribution is p-synchronizing if the probability mass is at least p in a single state, or in a given set of states. We consider four temporal synchronizing modes: a sequence of probability distributions is always p-synchronizing, eventually p-synchronizing, weakly p-synchronizing, or strongly p-synchronizing if, respectively, all, some, infinitely many, or all but finitely many distributions in the sequence are p-synchronizing. For each synchronizing mode, an MDP can be (i) sure winning if there is a strategy that produces a 1-synchronizing sequence; (ii) almost-sure winning if there is a strategy that produces a sequence that is, for all epsilon > 0, a (1-epsilon)-synchronizing sequence; (iii) limit-sure winning if for all epsilon > 0, there is a strategy that produces a (1-epsilon)-synchronizing sequence. We provide fundamental results on the expressiveness, decidability, and complexity of synchronizing properties for MDPs. For each synchronizing mode, we consider the problem of deciding whether an MDP is sure, almost-sure, or limit-sure winning, and we establish matching upper and lower complexity bounds of the problems: for all winning modes, we show that the problems are PSPACE-complete for eventually and weakly synchronizing, and PTIME-complete for always and strongly synchronizing. We establish the memory requirement for winning strategies, and we show that all winning modes coincide for always synchronizing, and that the almost-sure and limit-sure winning modes coincide for weakly and strongly synchronizing.

قيم البحث

360 - Laurent Doyen 2011

We introduce synchronizing objectives for Markov decision processes (MDP). Intuitively, a synchronizing objective requires that eventually, at every step there is a state which concentrates almost all the probability mass. In particular, it implies t hat the probabilistic system behaves in the long run like a deterministic system: eventually, the current state of the MDP can be identified with almost certainty. We study the problem of deciding the existence of a strategy to enforce a synchronizing objective in MDPs. We show that the problem is decidable for general strategies, as well as for blind strategies where the player cannot observe the current state of the MDP. We also show that pure strategies are sufficient, but memory may be necessary.

المنطق في علوم الحاسوب التعقيد الحسابي

A Faster-Than Relation for Semi-Markov Decision Processes

60 - Mathias Ruggaard Pedersen 2018

When modeling concurrent or cyber-physical systems, non-functional requirements such as time are important to consider. In order to improve the timing aspects of a model, it is necessary to have some notion of what it means for a process to be faster than another, which can guide the stepwise refinement of the model. To this end we study a faster-than relation for semi-Markov decision processes and compare it to standard notions for relating systems. We consider the compositional aspects of this relation, and show that the faster-than relation is not a precongruence with respect to parallel composition, hence giving rise to so-called parallel timing anomalies. We take the first steps toward understanding this problem by identifying decidable conditions sufficient to avoid parallel timing anomalies in the absence of non-determinism.

اللغات الرسمية ونظرية الأتومات

The Complexity of Reachability in Parametric Markov Decision Processes

240 - Sebastian Junges , Joost-Pieter Katoen , Guillermo A. Perez 2020

This article presents the complexity of reachability decision problems for parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set o f parameters. In particular, we study the complexity of finding values for these parameters such that the induced MDP satisfies some maximal or minimal reachability probability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem.

المنطق في علوم الحاسوب

Model-free Reinforcement Learning for Branching Markov Decision Processes

127 - Ernst Moritz Hahn , Mateo Perez , Sven Schewe 2021

We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMCs is a collection of entities of various type s that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of a each entity of the same type follows the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.

التعلم الآلي المنطق في علوم الحاسوب أنظمة وتحكم

On the Complexity of Reachability in Parametric Markov Decision Processes

308 - Tobias Winkler , Sebastian Junges , Guillermo A. Perez 2019

This paper studies parametric Markov decision processes (pMDPs), an extension to Markov decision processes (MDPs) where transitions probabilities are described by polynomials over a finite set of parameters. Fixing values for all parameters yields MD Ps. In particular, this paper studies the complexity of finding values for these parameters such that the induced MDP satisfies some reachability constraints. We discuss different variants depending on the comparison operator in the constraints and the domain of the parameter values. We improve all known lower bounds for this problem, and notably provide ETR-completeness results for distinct variants of this problem. Furthermore, we provide insights in the functions describing the induced reachability probabilities, and how pMDPs generalise concurrent stochastic reachability games.

المنطق في علوم الحاسوب التعقيد الحسابي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد الوطني للإدارة العامة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

The Complexity of Synchronizing Markov Decision Processes

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً