
Markov Chains with Maximum Return Time Entropy for Robotic Surveillance

Published by Xiaoming Duan
Publication date: 2018
Language: English





Motivated by robotic surveillance applications, this paper studies the novel problem of maximizing the return time entropy of a Markov chain, subject to a graph topology with travel times and stationary distribution. The return time entropy is the weighted average, over all graph nodes, of the entropy of the first return times of the Markov chain; this objective function is a function series that does not admit in general a closed form. The paper features theoretical and computational contributions. First, we obtain a discrete-time delayed linear system for the return time probability distribution and establish its convergence properties. We show that the objective function is continuous over a compact set and therefore admits a global maximum; a unique globally-optimal solution is known only for complete graphs with unitary travel times. We then establish upper and lower bounds between the return time entropy and the well-known entropy rate of the Markov chain. To compute the optimal Markov chain numerically, we establish the asymptotic equality between entropy, conditional entropy and truncated entropy, and propose an iteration to compute the gradient of the truncated entropy. Finally, we apply these results to the robotic surveillance problem. Our numerical results show that, for a model of rational intruder over prototypical graph topologies and test cases, the maximum return time entropy chain performs better than several existing Markov chains.
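The return time entropy described in the abstract can be approximated numerically. The following is a minimal sketch, not the paper's algorithm: it assumes unit travel times on every edge (the paper treats general travel times), uses natural logarithms, and truncates the series after `K` steps. The function names `stationary_distribution` and `truncated_return_time_entropy` and the parameter `K` are illustrative choices, not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(P.T)
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return v / v.sum()

def truncated_return_time_entropy(P, K=500):
    """Approximate the return time entropy of an irreducible chain P.

    Assumption: every edge has unit travel time.
    F[i, j] holds the probability that the first visit to j, starting
    from i, happens exactly at step k. The diagonal F[i, i] is the
    first return time distribution of node i.
    """
    pi = stationary_distribution(P)
    F = P.copy()                      # step k = 1
    entropy = 0.0
    for _ in range(K):
        f = np.diag(F).copy()         # return probabilities at the current step
        mask = f > 0                  # skip zero terms (0 * log 0 = 0)
        entropy -= np.sum(pi[mask] * f[mask] * np.log(f[mask]))
        # First-passage recursion: paths must avoid the target before arriving.
        F = P @ (F - np.diag(f))
    return entropy

# Complete graph on 3 nodes with uniform transitions (self-loops allowed).
P = np.full((3, 3), 1.0 / 3.0)
print(truncated_return_time_entropy(P))
```

For this uniform chain each return time is geometric with success probability 1/3, so the printed value can be checked against the closed-form entropy of a geometric distribution.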




Read also

This article surveys recent advancements in strategy design for persistent robotic surveillance tasks, with a focus on stochastic approaches. The problem describes how mobile robots stochastically patrol a graph in an efficient way, where efficiency is defined with respect to relevant underlying performance metrics. We start by reviewing the basics of Markov chains, which are the primary motion model for stochastic robotic surveillance. Then two main criteria regarding the speed and unpredictability of surveillance strategies are discussed. The central objects that appear throughout the treatment are the hitting times of Markov chains, their distributions and expectations. We formulate various optimization problems based on the concerned metrics in different scenarios and establish their respective properties.
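As an illustration of the hitting time expectations mentioned above, the sketch below solves the standard linear system for mean hitting times of a finite Markov chain with unit step durations. The function name `mean_hitting_times` is hypothetical, and the convention that the hitting time from the target to itself is zero is an assumption made here, not something stated in the survey.

```python
import numpy as np

def mean_hitting_times(P, target):
    """Expected number of steps to first reach `target` from every state.

    Solves m_i = 1 + sum_{h != target} P[i, h] * m_h for i != target.
    Assumes the target is reachable from every state, so the system
    is nonsingular. The entry for the target itself is set to 0.
    """
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]     # transitions among non-target states
    m_others = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    m = np.zeros(n)
    m[others] = m_others
    return m

# Uniform random walk on 3 states (self-loops allowed).
P = np.full((3, 3), 1.0 / 3.0)
print(mean_hitting_times(P, target=0))
```

In this example each step reaches the target with probability 1/3, so the mean hitting time from either other state is 3 steps.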
Many modern techniques employed in physics, such as the computation of path integrals, rely on random walks on graphs that can be represented as Markov chains. Traditionally, estimates of the running times of such sampling algorithms are computed using the number of steps the chain needs to reach the stationary distribution. This quantity is generally defined as the mixing time and is often difficult to compute. In this paper, we suggest an alternative estimate based on the Kolmogorov-Sinai entropy (KSE), by establishing a link between the maximization of the KSE and the minimization of the mixing time. Since the KSE is in general easier to compute than the mixing time, this link provides a new, faster method to approximate the minimum mixing time that could be of interest in computer science and statistical physics. Beyond this, our finding will also be of interest to the out-of-equilibrium community, by providing a new rationale for selecting stationary states in out-of-equilibrium physics: it seems reasonable that in a physical system with two simultaneous equiprobable possible dynamics, the final stationary state will be closer to the stationary state corresponding to the fastest dynamics (smallest mixing time). Through the empirical link found in this letter, this state will correspond to a state of maximal Kolmogorov-Sinai entropy. If this is true, it would provide a more satisfying rule for selecting stationary states in complex systems such as climate than the maximization of the entropy production.
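For a stationary finite Markov chain, the Kolmogorov-Sinai entropy coincides with the entropy rate, h = -Σ_i π_i Σ_j P_ij log P_ij. The sketch below computes it directly from a transition matrix; the function name `ks_entropy` and the example matrix are illustrative and not taken from the letter, and natural logarithms are assumed.

```python
import numpy as np

def ks_entropy(P):
    """Entropy rate of an irreducible Markov chain with transition matrix P.

    Uses natural logarithms; zero entries of P contribute nothing.
    """
    # Stationary distribution: left eigenvector for the eigenvalue closest to 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()
    # Replace zeros by 1 inside the log so that 0 * log(0) evaluates to 0.
    safe_P = np.where(P > 0, P, 1.0)
    row_entropy = -np.sum(P * np.log(safe_P), axis=1)
    return float(pi @ row_entropy)

# A two-state chain that tends to stay in state 0.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(ks_entropy(P))
```

Computing this quantity only requires the stationary distribution and the transition probabilities, which is the sense in which it is cheaper to evaluate than a mixing time.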
This paper studies a stochastic robotic surveillance problem where a mobile robot moves randomly on a graph to capture a potential intruder that strategically attacks a location on the graph. The intruder is assumed to be omniscient: it knows the current location of the mobile agent and can learn the surveillance strategy. The goal for the mobile robot is to design a stochastic strategy so as to maximize the probability of capturing the intruder. We model the strategic interactions between the surveillance robot and the intruder as a Stackelberg game, and optimal and suboptimal Markov chain based surveillance strategies in star, complete and line graphs are studied. We first derive a universal upper bound on the capture probability, i.e., the performance limit for the surveillance agent. We show that this upper bound is tight in the complete graph and further provide suboptimality guarantees for a natural design. For the star and line graphs, we first characterize dominant strategies for the surveillance agent and the intruder. Then, we rigorously prove the optimal strategy for the surveillance agent.
We study the problem of synthesizing a controller that maximizes the entropy of a partially observable Markov decision process (POMDP) subject to a constraint on the expected total reward. Such a controller minimizes the predictability of an agent's trajectories to an outside observer while guaranteeing the completion of a task expressed by a reward function. We first prove that an agent with partial observations can achieve an entropy at most as large as that of an agent with perfect observations. Then, focusing on finite-state controllers (FSCs) with deterministic memory transitions, we show that the maximum entropy of a POMDP is lower bounded by the maximum entropy of the parametric Markov chain (pMC) induced by such FSCs. This relationship allows us to recast the entropy maximization problem as a so-called parameter synthesis problem for the induced pMC. We then present an algorithm to synthesize an FSC that locally maximizes the entropy of a POMDP over FSCs with the same number of memory states. In numerical examples, we illustrate the relationship between the maximum entropy, the number of memory states in the FSC, and the expected reward.
Continuous-time Markov chains are mathematical models that are used to describe the state-evolution of dynamical systems under stochastic uncertainty, and have found widespread applications in various fields. In order to make these models computationally tractable, they rely on a number of assumptions that may not be realistic for the domain of application; in particular, the ability to provide exact numerical parameter assessments, and the applicability of time-homogeneity and the eponymous Markov property. In this work, we extend these models to imprecise continuous-time Markov chains (ICTMCs), which are a robust generalisation that relaxes these assumptions while remaining computationally tractable. More technically, an ICTMC is a set of precise continuous-time finite-state stochastic processes, and rather than computing expected values of functions, we seek to compute lower expectations, which are tight lower bounds on the expectations that correspond to such a set of precise models. Note that, in contrast to e.g. Bayesian methods, all the elements of such a set are treated on equal grounds; we do not consider a distribution over this set. The first part of this paper develops a formalism for describing continuous-time finite-state stochastic processes that does not require the aforementioned simplifying assumptions. Next, this formalism is used to characterise ICTMCs and to investigate their properties. The concept of lower expectation is then given an alternative operator-theoretic characterisation, by means of a lower transition operator, and the properties of this operator are investigated as well. Finally, we use this lower transition operator to derive tractable algorithms (with polynomial runtime complexity w.r.t. the maximum numerical error) for computing the lower expectation of functions that depend on the state at any finite number of time points.