Markov automata combine non-determinism, probabilistic branching, and exponentially distributed delays. This compositional variant of continuous-time Markov decision processes is used in reliability engineering, performance evaluation, and stochastic scheduling. Their verification has so far focused on single objectives such as (timed) reachability and expected costs. In practice, the objectives are often mutually dependent, and the aim is to reveal trade-offs. We present algorithms to analyze several objectives simultaneously and to approximate Pareto curves. This includes, e.g., several (timed) reachability objectives or various expected cost objectives. We also consider combinations thereof, such as on-time-within-budget objectives: which policies guarantee reaching a goal state within a deadline with at least probability $p$ while keeping the allowed average costs below a threshold? We adapt existing approaches for classical Markov decision processes. The main challenge is to treat policies exploiting state residence times, even for untimed objectives. Experimental results show the feasibility and scalability of our approach.
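To make the on-time-within-budget query concrete, one way to phrase it (the symbols $G$, $d$, and $b$ are illustrative and not taken from the abstract) is to ask for a policy $\sigma$ satisfying
\[
\Pr^{\sigma}\!\bigl(\lozenge^{\le d}\, G\bigr) \;\ge\; p
\quad\text{and}\quad
\mathbb{E}^{\sigma}[\mathit{cost}] \;\le\; b,
\]
where $G$ is the set of goal states, $d$ the deadline, and $b$ the cost budget; an approximated Pareto curve then records, for each achievable probability $p$, bounds on the smallest budget $b$ for which such a policy exists.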
We introduce synchronizing objectives for Markov decision processes (MDP). Intuitively, a synchronizing objective requires that eventually, at every step there is a state which concentrates almost all the probability mass. In particular, it implies that the probabilistic system behaves in the long run like a deterministic system: eventually, the current state of the MDP can be identified with almost certainty. We study the problem of deciding the existence of a strategy to enforce a synchronizing objective in MDPs. We show that the problem is decidable for general strategies, as well as for blind strategies where the player cannot observe the current state of the MDP. We also show that pure strategies are sufficient, but memory may be necessary.
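One way to formalize this intuition (notation chosen here for illustration): writing $X_n$ for the probability distribution over the state space $Q$ after $n$ steps under a fixed strategy, the objective asks that
\[
\lim_{n \to \infty} \; \max_{q \in Q} X_n(q) \;=\; 1,
\]
i.e., for every $\epsilon > 0$ there is a step $N$ such that at every step $n \ge N$ some single state carries probability mass at least $1-\epsilon$.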
A new weak bisimulation semantics is defined for Markov automata that, in addition to abstracting from internal actions, sums up the expected values of consecutive exponentially distributed delays possibly intertwined with internal actions. The resulting equivalence is shown to be a congruence with respect to parallel composition for Markov automata. Moreover, it turns out to be comparable with weak bisimilarity for timed labeled transition systems, thus constituting a step towards reconciling the semantics for stochastic time and deterministic time.
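For intuition on the summation of delays (a standard fact about exponential distributions rather than a contribution of the paper): an exponentially distributed delay with rate $\lambda$ has expected duration $1/\lambda$, so a sequence of consecutive delays with rates $\lambda_1, \dots, \lambda_k$, possibly intertwined with internal actions, is abstracted into a single delay with expected duration
\[
\sum_{i=1}^{k} \frac{1}{\lambda_i}.
\]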
We study tree games, recently developed by Matteo Mio as a game interpretation of the probabilistic $\mu$-calculus. With expressive power comes complexity. Mio showed that tree games are able to encode Blackwell games and, consequently, are not determined under deterministic strategies. We show that non-stochastic tree games with objectives recognisable by so-called game automata are determined under deterministic, finite-memory strategies. Moreover, we give an elementary algorithmic procedure which, for an arbitrary regular language $L$ and a finite non-stochastic tree game with winning objective $L$, decides whether the game is determined under deterministic strategies.
Recently, successful approaches have been made to exploit good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model-free reinforcement learning; this class of automata subsumes good-for-games automata and the most widespread class of limit-deterministic automata. The foundation of using these Büchi automata is that the Büchi condition can, for good-for-MDPs automata, be translated to reachability. The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward shaping approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.
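For readers unfamiliar with reward shaping, the classical potential-based scheme of Ng, Harada, and Russell (shown here only as background, not as the specific construction of this paper) replaces the reward $r(s,a,s')$ by
\[
r'(s,a,s') \;=\; r(s,a,s') \;+\; \gamma\,\Phi(s') \;-\; \Phi(s)
\]
for a potential function $\Phi$ over states and discount factor $\gamma$; such a transformation preserves optimal policies while allowing reward to be collected earlier along a run.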
We consider previous models of Timed, Probabilistic, and Stochastic Timed Automata, introduce our model of Timed Automata with Polynomial Delay, and characterize the expressiveness of these models relative to each other.