
A Risk-Sensitive Finite-Time Reachability Approach for Safety of Stochastic Dynamic Systems

Published by Margaret Chapman
Publication date: 2019
Research field: Informatics Engineering
Research language: English





A classic reachability problem for safety of dynamic systems is to compute the set of initial states from which the state trajectory is guaranteed to stay inside a given constraint set over a given time horizon. In this paper, we leverage existing theory of reachability analysis and risk measures to devise a risk-sensitive reachability approach for safety of stochastic dynamic systems under non-adversarial disturbances over a finite time horizon. Specifically, we first introduce the notion of a risk-sensitive safe set as a set of initial states from which the risk of large constraint violations can be reduced to a required level via a control policy, where risk is quantified using the Conditional Value-at-Risk (CVaR) measure. Second, we show how the computation of a risk-sensitive safe set can be reduced to the solution of a Markov Decision Process (MDP), where cost is assessed according to CVaR. Third, leveraging this reduction, we devise a tractable algorithm to approximate a risk-sensitive safe set, and provide theoretical arguments about its correctness. Finally, we present a realistic example inspired by stormwater catchment design to demonstrate the utility of risk-sensitive reachability analysis. In particular, our approach allows a practitioner to tune the level of risk sensitivity from worst-case (which is typical for Hamilton-Jacobi reachability analysis) to risk-neutral (which is the case for stochastic reachability analysis).
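As a rough illustration of the CVaR measure named in the abstract, the sketch below estimates CVaR from sampled costs. It assumes the convention that CVaR at level `alpha` is the mean of the worst `alpha` fraction of outcomes, with `alpha` in (0, 1]. The function name `empirical_cvar` and the sample data are hypothetical and are not taken from the paper.

```python
import math


def empirical_cvar(samples, alpha):
    """Mean of the largest alpha-fraction of cost samples (simple estimator).

    Assumed convention: larger cost is worse, alpha in (0, 1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples, reverse=True)          # worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))      # size of the tail to average
    return sum(ordered[:k]) / k


costs = [1.0, 2.0, 3.0, 4.0]             # hypothetical sampled costs
print(empirical_cvar(costs, 1.0))        # 2.5, the sample mean (risk-neutral)
print(empirical_cvar(costs, 0.5))        # 3.5, mean of the worst half
print(empirical_cvar(costs, 0.25))       # 4.0, the sample maximum (worst case)
```

Under this convention, moving `alpha` from 1 toward 0 moves the estimate from the sample mean toward the sample maximum, which mirrors the tuning between risk-neutral and worst-case analysis described above.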


Read also

This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle; thus, our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile, where it matters most for safety.
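To make the "sublevel set" idea concrete, here is a minimal Monte Carlo sketch. It is not the value iteration algorithm from the abstract: it fixes one policy (folded into the toy dynamics) instead of optimizing over policies, so, up to sampling error, the set it reports can only be a subset of the set obtained by minimizing over policies. The scalar dynamics, the noise range, the constraint set [-1, 1], and all function names are assumptions chosen for illustration.

```python
import math
import random


def cvar(samples, alpha):
    """Mean of the worst (largest) alpha-fraction of samples, alpha in (0, 1]."""
    ordered = sorted(samples, reverse=True)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def max_violation(x0, horizon, rng):
    """Largest signed distance to the constraint set [-1, 1] along one trajectory."""
    x = x0
    worst = abs(x) - 1.0                       # violation at the initial state
    for _ in range(horizon):
        # Hypothetical closed-loop dynamics: contraction plus bounded noise.
        x = 0.5 * x + rng.uniform(-0.2, 0.2)
        worst = max(worst, abs(x) - 1.0)
    return worst


def in_safe_set(x0, alpha, level, horizon=10, n=2000, seed=0):
    """True if the estimated CVaR of the maximum violation is at most `level`."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    costs = [max_violation(x0, horizon, rng) for _ in range(n)]
    return cvar(costs, alpha) <= level


grid = [i / 4 for i in range(-8, 9)]           # candidate initial states
safe = [x0 for x0 in grid if in_safe_set(x0, alpha=0.05, level=0.0)]
print(safe)
```

Lowering `level` below zero asks for a safety margin inside the constraint set, and lowering `alpha` makes the test depend on a smaller, more extreme tail of outcomes.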
Autonomous cyber-physical systems (CPS) rely on the correct operation of numerous components, with state-of-the-art methods relying on machine learning (ML) and artificial intelligence (AI) components in various stages of sensing and control. This paper develops methods for estimating the reachable set and verifying safety properties of dynamical systems under control of neural network-based controllers that may be implemented in embedded software. The neural network controllers we consider are feedforward neural networks called multilayer perceptrons (MLP) with general activation functions. As such feedforward networks are memoryless, they may be abstractly represented as mathematical functions, and the reachability analysis of the network amounts to range (image) estimation of this function provided a set of inputs. By discretizing the input set of the MLP into a finite number of hyper-rectangular cells, our approach develops a linear programming (LP) based algorithm for over-approximating the output set of the MLP with its input set as a union of hyper-rectangular cells. Combining the over-approximation for the output set of an MLP based controller and reachable set computation routines for ordinary difference/differential equation (ODE) models, an algorithm is developed to estimate the reachable set of the closed-loop system. Finally, safety verification for neural network control systems can be performed by checking the existence of intersections between the estimated reachable set and unsafe regions. The approach is implemented in a computational software prototype and evaluated on numerical examples.
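The range-estimation idea can be sketched with plain interval arithmetic. Note that this is a simpler, swapped-in technique (interval bound propagation), not the LP-based algorithm the abstract describes, and it is usually looser. The sketch assumes ReLU activations between layers and a single hyper-rectangular input cell; the weights and all function names are hypothetical.

```python
def affine_bounds(W, b, lo, hi):
    """Box enclosing {W x + b : lo <= x <= hi}, computed row by row."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, a, c in zip(row, lo, hi):
            if w >= 0:            # positive weight: lower bound uses lower input
                l += w * a
                h += w * c
            else:                 # negative weight: the roles swap
                l += w * c
                h += w * a
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def mlp_output_box(layers, lo, hi):
    """layers: list of (W, b) pairs; ReLU after every layer except the last."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


def boxes_intersect(lo1, hi1, lo2, hi2):
    """True if two axis-aligned boxes overlap in every coordinate."""
    return all(a <= d and c <= b for a, b, c, d in zip(lo1, hi1, lo2, hi2))


# Hypothetical 2-2-1 network and the input cell [0, 1] x [0, 1].
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.25]),
          ([[1.0, 1.0]], [0.0])]
out_lo, out_hi = mlp_output_box(layers, [0.0, 0.0], [1.0, 1.0])
print(out_lo, out_hi)                                   # [0.0] [1.75]
print(boxes_intersect(out_lo, out_hi, [2.0], [3.0]))    # False
```

Splitting the input set into smaller cells and taking the union of the per-cell output boxes tightens the enclosure. If the enclosure does not intersect an unsafe region, the true output set cannot intersect it either.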
Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optimization in robotic systems. Here we extend classic DDP to systems with multiple time-delays in the state. Being able to find optimal trajectories for time-delayed systems with DDP opens up the possibility to use richer models for system identification and control, including recurrent neural networks with multiple timesteps in the state. We demonstrate the algorithm on a two-tank continuous stirred tank reactor. We also demonstrate the algorithm on a recurrent neural network trained to model an inverted pendulum with position information only.
This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed using the Conditional Value-at-Risk (CVaR) functional. The solution to the control problem represents the maximum extent of constraint violation of the state trajectory, averaged over the $\alpha \cdot 100$% worst cases, where $\alpha \in (0,1]$. This problem is well-motivated but difficult to solve in a tractable fashion because temporal decompositions for risk functionals generally depend on the history of the system's behavior. Our primary theoretical contribution is to derive under-approximations to risk-sensitive safe sets, which are computationally tractable. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution using numerical examples of a thermostatically controlled load system and a stormwater system.
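For readers who want the objects in symbols, one way to write them is sketched below. The notation is assumed for illustration and is not taken from the abstract: $Z$ is a random cost, $g(x)$ measures constraint violation at state $x$, $X_0, \dots, X_N$ is the state trajectory started at $X_0 = x$ under policy $\pi$, and $r$ is the required level. The first line is the standard variational form of CVaR for $\alpha \in (0,1]$; the second is a sub-level-set definition in the spirit of the text.

```latex
% Assumed notation: Z random cost, g constraint-violation function,
% (X_t) trajectory from X_0 = x under policy \pi, r required level.
\begin{align}
  \mathrm{CVaR}_{\alpha}(Z)
    &= \inf_{s \in \mathbb{R}}
       \left\{ s + \frac{1}{\alpha}\,
       \mathbb{E}\bigl[\max(Z - s,\, 0)\bigr] \right\},
       \qquad \alpha \in (0,1], \\
  \mathcal{S}_{\alpha}^{r}
    &= \left\{ x \;:\; \inf_{\pi}\,
       \mathrm{CVaR}_{\alpha}\!\Bigl( \max_{t = 0,\dots,N} g(X_t) \Bigr)
       \le r \right\}.
\end{align}
```

With $\alpha = 1$ the first expression reduces to $\mathbb{E}[Z]$, and as $\alpha \to 0$ it approaches the essential supremum of $Z$, which matches the risk-neutral and worst-case extremes.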
We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as angelic and demonic non-determinism. Lexicographic order allows one to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic and memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP $\cap$ coNP, matching the current known bound for single objectives; and in general the decision problem is PSPACE-hard and can be solved in NEXPTIME $\cap$ coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to computation of optimal strategies in a sequence of single-objective games. We have implemented our algorithm and report experimental results on various case studies.