Markov Automata with Multiple Objectives

Published by: Tim Quatmann

Publication date: 2017

Research field: Informatics Engineering

Language: English

Authors: Tim Quatmann, Sebastian Junges, Joost-Pieter Katoen

Logic in Computer Science

Markov automata combine non-determinism, probabilistic branching, and exponentially distributed delays. This compositional variant of continuous-time Markov decision processes is used in reliability engineering, performance evaluation, and stochastic scheduling. Their verification has so far focused on single objectives such as (timed) reachability and expected costs. In practice, objectives are often mutually dependent, and the aim is to reveal trade-offs. We present algorithms to analyze several objectives simultaneously and approximate Pareto curves. This includes, e.g., several (timed) reachability objectives, or various expected cost objectives. We also consider combinations thereof, such as on-time-within-budget objectives: which policies guarantee reaching a goal state within a deadline with probability at least $p$ while keeping the allowed average costs below a threshold? We adopt existing approaches for classical Markov decision processes. The main challenge is to treat policies exploiting state residence times, even for untimed objectives. Experimental results show the feasibility and scalability of our approach.

Laurent Doyen, 2011

We introduce synchronizing objectives for Markov decision processes (MDP). Intuitively, a synchronizing objective requires that eventually, at every step there is a state which concentrates almost all the probability mass. In particular, it implies that the probabilistic system behaves in the long run like a deterministic system: eventually, the current state of the MDP can be identified with almost certainty. We study the problem of deciding the existence of a strategy to enforce a synchronizing objective in MDPs. We show that the problem is decidable for general strategies, as well as for blind strategies where the player cannot observe the current state of the MDP. We also show that pure strategies are sufficient, but memory may be necessary.

Logic in Computer Science, Computational Complexity

Expected-Delay-Summing Weak Bisimilarity for Markov Automata

Alessandro Aldini, 2015

A new weak bisimulation semantics is defined for Markov automata that, in addition to abstracting from internal actions, sums up the expected values of consecutive exponentially distributed delays possibly intertwined with internal actions. The resulting equivalence is shown to be a congruence with respect to parallel composition for Markov automata. Moreover, it turns out to be comparable with weak bisimilarity for timed labeled transition systems, thus constituting a step towards reconciling the semantics for stochastic time and deterministic time.

Logic in Computer Science

Tree games with regular objectives

Marcin Przybyłko, 2014

We study tree games developed recently by Matteo Mio as a game interpretation of the probabilistic $\mu$-calculus. With expressive power comes complexity. Mio showed that tree games are able to encode Blackwell games and, consequently, are not determined under deterministic strategies. We show that non-stochastic tree games with objectives recognisable by so-called game automata are determined under deterministic, finite-memory strategies. Moreover, we give an elementary algorithmic procedure which, for an arbitrary regular language L and a finite non-stochastic tree game with a winning objective L, decides if the game is determined under deterministic strategies.

Logic in Computer Science, Formal Languages and Automata Theory, Computer Science and Game Theory

Reward Shaping for Reinforcement Learning with Omega-Regular Objectives

E. M. Hahn, M. Perez, S. Schewe, 2020

Recently, successful approaches have been made to exploit good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model-free reinforcement learning, a class of automata that subsumes good-for-games automata and the most widespread class of limit-deterministic automata. The foundation of using these Büchi automata is that the Büchi condition can, for good-for-MDPs automata, be translated to reachability. The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward shaping approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.

Logic in Computer Science, Machine Learning

Timed Automata with Polynomial Delay and their Expressiveness

Valentin Bura, Tim French, Mark Reynolds, 2017

We consider previous models of Timed, Probabilistic and Stochastic Timed Automata, introduce our model of Timed Automata with Polynomial Delay, and characterize the expressiveness of these models relative to each other.

Logic in Computer Science
