ترغب بنشر مسار تعليمي؟ اضغط هنا

Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning

86   0   0.0 ( 0 )
 نشر من قبل Ernst Moritz Hahn
 تاريخ النشر 2019
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We characterize the class of nondeterministic ${omega}$-automata that can be used for the analysis of finite Markov decision processes (MDPs). We call these automata `good-for-MDPs (GFM). We show that GFM automata are closed under classic simulation as well as under more powerful simulation relations that leverage properties of optimal control strategies for MDPs. This closure enables us to exploit state-space reduction techniques, such as those based on direct and delayed simulation, that guarantee simulation equivalence. We demonstrate the promise of GFM automata by defining a new class of automata with favorable properties - they are Buchi automata with low branching degree obtained through a simple construction - and show that going beyond limit-deterministic automata may significantly benefit reinforcement learning.



قيم البحث

اقرأ أيضاً

The probabilistic bisimilarity distance of Deng et al. has been proposed as a robust quantitative generalization of Segala and Lynchs probabilistic bisimilarity for probabilistic automata. In this paper, we present a characterization of the bisimilar ity distance as the solution of a simple stochastic game. The characterization gives us an algorithm to compute the distances by applying Condons simple policy iteration on these games. The correctness of Condons approach, however, relies on the assumption that the games are stopping. Our games may be non-stopping in general, yet we are able to prove termination for this extended class of games. Already other algorithms have been proposed in the literature to compute these distances, with complexity in $textbf{UP} cap textbf{coUP}$ and textbf{PPAD}. Despite the theoretical relevance, these algorithms are inefficient in practice. To the best of our knowledge, our algorithm is the first practical solution. The characterization of the probabilistic bisimilarity distance mentioned above crucially uses a dual presentation of the Hausdorff distance due to Memoli. As an additional contribution, in this paper we show that Memolis result can be used also to prove that the bisimilarity distance bounds the difference in the maximal (or minimal) probability of two states to satisfying arbitrary $omega$-regular properties, expressed, eg., as LTL formulas.
In GFG automata, it is possible to resolve nondeterminism in a way that only depends on the past and still accepts all the words in the language. The motivation for GFG automata comes from their adequacy for games and synthesis, wherein general nonde terminism is inappropriate. We continue the ongoing effort of studying the power of nondeterminism in GFG automata. Initial indications have hinted that every GFG automaton embodies a deterministic one. Today we know that this is not the case, and in fact GFG automata may be exponentially more succinct than deterministic ones. We focus on the typeness question, namely the question of whether a GFG automaton with a certain acceptance condition has an equivalent GFG automaton with a weaker acceptance condition on the same structure. Beyond the theoretical interest in studying typeness, its existence implies efficient translations among different acceptance conditions. This practical issue is of special interest in the context of games, where the Buchi and co-Buchi conditions admit memoryless strategies for both players. Typeness is known to hold for deterministic automata and not to hold for general nondeterministic automata. We show that GFG automata enjoy the benefits of typeness, similarly to the case of deterministic automata. In particular, when Rabin or Streett GFG automata have equivalent Buchi or co-Buchi GFG automata, respectively, then such equivalent automata can be defined on a substructure of the original automata. Using our typeness results, we further study the place of GFG automata in between deterministic and nondeterministic ones. Specifically, considering automata complementation, we show that GFG automata lean toward nondeterministic ones, admitting an exponential state blow-up in the complementation of a Streett automaton into a Rabin automaton, as opposed to the constant blow-up in the deterministic case.
109 - Sven Schewe 2020
This paper discusses the hardness of finding minimal good-for-games (GFG) Buchi, Co-Buchi, and parity automata with state based acceptance. The problem appears to sit between finding small deterministic and finding small nondeterministic automata, wh ere minimality is NP-complete and PSPACE-complete, respectively. However, recent work of Radi and Kupferman has shown that minimising Co-Buchi automata with transition based acceptance is tractable, which suggests that the complexity of minimising GFG automata might be cheaper than minimising deterministic automata. We show for the standard state based acceptance that the minimality of a GFG automaton is NP-complete for Buchi, Co-Buchi, and parity GFG automata. The proofs are a surprisingly straight forward generalisation of the proofs from deterministic Buchi automata: they use a similar reductions, and the same hard class of languages.
In [1], we introduced the weakly synchronizing languages for probabilistic automata. In this report, we show that the emptiness problem of weakly synchronizing languages for probabilistic automata is undecidable. This implies that the decidability re sult of [1-3] for the emptiness problem of weakly synchronizing language is incorrect.
This short note aims at proving that the isolation problem is undecidable for probabilistic automata with only one probabilistic transition. This problem is known to be undecidable for general probabilistic automata, without restriction on the number of probabilistic transitions. In this note, we develop a simulation technique that allows to simulate any probabilistic automaton with one having only one probabilistic transition.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا