
Risk-Averse Planning Under Uncertainty

Published by Mohamadreza Ahmadi
Publication date: 2019
Research field: Informatics Engineering
Research language: English





We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and optimality criterion, the proposed method modifies the stochastic finite state controller, leading to sub-optimal solutions with lower coherent risk.
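To make the notion of a stochastic finite state controller concrete, here is a minimal sketch in Python. It is not the paper's implementation: the class name `StochasticFSC`, the dictionary layout, and the two-node example are all assumptions chosen for illustration. The controller keeps a memory node, samples an action from a distribution attached to that node, and samples the next node from a distribution indexed by the current node, the action taken, and the observation received.

```python
import random


class StochasticFSC:
    """Hypothetical stochastic finite state controller (illustrative only)."""

    def __init__(self, action_probs, node_probs, start_node=0, seed=None):
        # action_probs[g][a] = P(action a | memory node g)
        self.action_probs = action_probs
        # node_probs[(g, a, o)][g2] = P(next node g2 | node g, action a, observation o)
        self.node_probs = node_probs
        self.node = start_node
        self.rng = random.Random(seed)

    def _sample(self, dist):
        # Draw one outcome from a {outcome: probability} dictionary.
        outcomes = list(dist)
        weights = [dist[o] for o in outcomes]
        return self.rng.choices(outcomes, weights=weights)[0]

    def act(self):
        # Sample an action from the distribution of the current memory node.
        return self._sample(self.action_probs[self.node])

    def update(self, action, observation):
        # Sample the next memory node after seeing the observation.
        self.node = self._sample(self.node_probs[(self.node, action, observation)])
        return self.node


# Two-node example with made-up action and observation names.
controller = StochasticFSC(
    action_probs={0: {"listen": 1.0}, 1: {"open": 1.0}},
    node_probs={
        (0, "listen", "quiet"): {1: 1.0},
        (0, "listen", "noise"): {0: 1.0},
        (1, "open", "quiet"): {0: 1.0},
        (1, "open", "noise"): {0: 1.0},
    },
)
```

The memory budget mentioned in the abstract corresponds here to the number of memory nodes; a policy-improvement step would adjust the probabilities stored in the two dictionaries.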




Read also

We consider the stochastic shortest path planning problem in MDPs, i.e., the problem of designing policies that ensure reaching a goal state from a given initial state with minimum accrued cost. In order to account for rare but important realizations of the system, we consider a nested dynamic coherent risk total cost functional rather than the conventional risk-neutral total expected cost. Under some assumptions, we show that optimal, stationary, Markovian policies exist and can be found via a special Bellman's equation. We propose a computational technique based on difference convex programs (DCPs) to find the associated value functions and therefore the risk-averse policies. A rover navigation MDP is used to illustrate the proposed methodology with conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
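As a rough illustration of the two risk measures named in this abstract, the sketch below estimates CVaR and EVaR from a finite sample of costs. It is an assumption-laden sketch, not the paper's method: the function names, the convention that `alpha` is a confidence level in [0, 1) with tail probability `1 - alpha`, and the log-spaced grid used for the EVaR minimization are all choices made here, and the paper's parameterization may differ. CVaR uses the Rockafellar-Uryasev form min_z { z + E[(X - z)^+] / (1 - alpha) }; EVaR uses inf_{z > 0} (1/z) ln( E[exp(zX)] / (1 - alpha) ).

```python
import math


def cvar(costs, alpha):
    """Empirical CVaR of costs (higher is worse) at confidence level alpha."""
    xs = list(costs)
    n = len(xs)
    tail = 1.0 - alpha
    # For an empirical distribution the minimum over z is attained at a sample point.
    return min(z + sum(max(x - z, 0.0) for x in xs) / (n * tail) for z in xs)


def evar(costs, alpha, grid=None):
    """Empirical EVaR at confidence level alpha, minimized over a grid of z values."""
    xs = list(costs)
    n = len(xs)
    tail = 1.0 - alpha
    if grid is None:
        # Assumed search grid: z from 1e-4 to 1e3, log-spaced. The result is an
        # upper bound on the true infimum, which is tight only if the grid is fine enough.
        grid = [10 ** (k / 10) for k in range(-40, 31)]
    best = math.inf
    for z in grid:
        # Log-sum-exp trick keeps exp() from overflowing for large z.
        m = max(z * x for x in xs)
        log_mgf = m + math.log(sum(math.exp(z * x - m) for x in xs) / n)
        best = min(best, (log_mgf - math.log(tail)) / z)
    return best


sample_costs = [1.0, 2.0, 3.0, 4.0, 10.0]
risk_neutral = cvar(sample_costs, 0.0)   # reduces to the sample mean
tail_cvar = cvar(sample_costs, 0.8)      # average of the worst 20% of outcomes
tail_evar = evar(sample_costs, 0.8)      # at least as large as CVaR at the same level
```

At the same confidence level EVaR is never smaller than CVaR, so it is the more conservative of the two measures.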
Collision avoidance is an essential concern for the autonomous operations of aerial vehicles in dynamic and uncertain urban environments. This paper introduces a risk-bounded path planning algorithm for unmanned aerial vehicles (UAVs) operating in such environments. This algorithm advances the rapidly-exploring random tree (RRT) with chance constraints to generate probabilistically guaranteed collision-free paths that are robust to vehicle and environmental obstacle uncertainties. Assuming all uncertainties follow Gaussian distributions, the chance constraints are established through converting dynamic and probabilistic constraints into equivalent static and deterministic constraints. By incorporating chance constraints into the RRT algorithm, the proposed algorithm not only inherits the computational advantage of sampling-based algorithms but also guarantees a probabilistically feasible flying zone at every time step. Simulation results show the promising performance of the proposed algorithm.
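The conversion from a probabilistic constraint to a deterministic one can be sketched for the simplest case. The assumptions here are mine, not the paper's: a single linear constraint a^T x <= b (for example, one face of a polygonal obstacle), a Gaussian position x ~ N(mu, Sigma), and the hypothetical function name `chance_constraint_holds`. Under those assumptions, P(a^T x <= b) >= 1 - delta is equivalent to the deterministic test a^T mu + Phi^{-1}(1 - delta) * sqrt(a^T Sigma a) <= b, where Phi^{-1} is the standard normal quantile function.

```python
import math
from statistics import NormalDist


def chance_constraint_holds(a, b, mu, sigma, delta):
    """Check P(a^T x <= b) >= 1 - delta for x ~ N(mu, sigma), via the tightened mean test."""
    dim = len(a)
    mean = sum(a[i] * mu[i] for i in range(dim))
    # Variance of the scalar a^T x is a^T Sigma a.
    variance = sum(a[i] * sigma[i][j] * a[j] for i in range(dim) for j in range(dim))
    # Safety margin grows with the required confidence and with the uncertainty.
    margin = NormalDist().inv_cdf(1.0 - delta) * math.sqrt(variance)
    return mean + margin <= b


identity = [[1.0, 0.0], [0.0, 1.0]]
# Stay left of the wall x1 <= 5 with at least 95% probability.
safe = chance_constraint_holds([1.0, 0.0], 5.0, [3.0, 0.0], identity, 0.05)
unsafe = chance_constraint_holds([1.0, 0.0], 5.0, [4.0, 0.0], identity, 0.05)
```

In a sampling-based planner, a check of this kind would be applied to each candidate node before it is added to the tree; how the cited paper handles multiple obstacle faces and time-varying uncertainty is not specified in the abstract.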
Anastasis Kratsios, 2019
This paper introduces an intermediary between conditional expectation and conditional sublinear expectation, called R-conditioning. The R-conditioning of a random vector in $L^2$ is defined as the best $L^2$-estimate, given a $\sigma$-subalgebra and a degree of model uncertainty. When the random vector represents the payoff of a derivative security in a complete financial market, its R-conditioning with respect to the risk-neutral measure is interpreted as its risk-averse value. The optimization problem defining the R-conditioning is shown to be well-posed. We show that the R-conditioning operators can be used to approximate a large class of sublinear expectations to arbitrary precision. We then introduce a novel numerical algorithm for computing the R-conditioning. This algorithm is shown to be strongly convergent. Implementations are used to compare the risk-averse value of a Vanilla option to its traditional risk-neutral value, within the Black-Scholes-Merton framework. Concrete connections to robust finance, sensitivity analysis, and high-dimensional estimation are all treated in this paper.
Imitation learning algorithms learn viable policies by imitating an expert's behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories. We evaluate in terms of the expert's cost function and observe that the distribution of trajectory-costs is often more heavy-tailed for GAIL-agents than the expert at a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL-agents than the expert. This makes the reliability of GAIL-agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus the proposed RAIL algorithm appears as a potent alternative to GAIL for improved reliability in risk-sensitive applications.
This paper considers safe robot mission planning in uncertain dynamical environments. This problem arises in applications such as surveillance, emergency rescue, and autonomous driving. It is a challenging problem due to modeling and integrating dynamical uncertainties into a safe planning framework, and finding a solution in a computationally tractable way. In this work, we first develop a probabilistic model for dynamical uncertainties. Then, we provide a framework to generate a path that maximizes safety for complex missions by incorporating the uncertainty model. We also devise a Monte Carlo method to obtain a safe path efficiently. Finally, we evaluate the performance of our approach and compare it to potential alternatives in several case studies.
