The standard approach to risk-averse control is to use the Exponential Utility (EU) functional, which has been studied for several decades. Like other risk-averse utility functionals, EU encodes risk aversion through an increasing convex mapping $\varphi$ of objective costs to subjective costs. An objective cost is a realization $y$ of a random variable $Y$. In contrast, a subjective cost is a realization $\varphi(y)$ of a random variable $\varphi(Y)$ that has been transformed to measure preferences about the outcomes. For EU, the transformation is $\varphi(y) = \exp(\frac{-\theta}{2}y)$, and under certain conditions, the quantity $\varphi^{-1}(E(\varphi(Y)))$ can be approximated by a linear combination of the mean and variance of $Y$. More recently, there has been growing interest in risk-averse control using the Conditional Value-at-Risk (CVaR) functional. In contrast to the EU functional, the CVaR of a random variable $Y$ concerns a fraction of its possible realizations. If $Y$ is a continuous random variable with finite $E(|Y|)$, then the CVaR of $Y$ at level $\alpha$ is the expectation of $Y$ in the $\alpha \cdot 100\%$ worst cases. Here, we study the applications of risk-averse functionals to controller synthesis and safety analysis through the development of numerical examples, with emphasis on EU and CVaR. Our contribution is to examine the decision-theoretic, mathematical, and computational trade-offs that arise when using EU and CVaR for optimal control and safety analysis. We are hopeful that this work will advance the interpretability and elucidate the potential benefits of risk-averse control technology.
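The two functionals described above can be compared on sampled costs. The following is a minimal sketch, not the paper's implementation. The function names are hypothetical, $\theta < 0$ is assumed so that $\varphi$ is increasing and convex, and the mean-variance expression $E(Y) - \frac{\theta}{4}\mathrm{Var}(Y)$ is the standard small-$|\theta|$ expansion of $\varphi^{-1}(E(\varphi(Y)))$ rather than a formula taken from the abstract.

```python
import math


def empirical_cvar(samples, alpha):
    """Average of the ceil(alpha * n) largest samples (costs: larger is worse)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(samples, reverse=True)
    k = math.ceil(alpha * len(ordered))
    return sum(ordered[:k]) / k


def eu_certainty_equivalent(samples, theta):
    """phi^{-1}(E[phi(Y)]) for phi(y) = exp(-theta/2 * y); theta < 0 is assumed."""
    s = -theta / 2.0
    # Log-sum-exp shift keeps the exponentials from overflowing.
    shift = max(s * y for y in samples)
    mean_exp = sum(math.exp(s * y - shift) for y in samples) / len(samples)
    return (shift + math.log(mean_exp)) / s


def mean_variance_approx(samples, theta):
    """Small-|theta| approximation: E[Y] - (theta / 4) * Var[Y]."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((y - mean) ** 2 for y in samples) / n
    return mean - (theta / 4.0) * var
```

For example, `empirical_cvar(costs, 0.05)` averages only the worst 5% of the sampled costs, whereas `eu_certainty_equivalent(costs, -0.1)` reweights the entire sample.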
This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed using the Conditional Value-at-Risk (CVaR) functional. The solution to the control problem represents the maximum extent of constraint violation of the state trajectory, averaged over the $\alpha \cdot 100\%$ worst cases, where $\alpha \in (0,1]$. This problem is well-motivated but difficult to solve tractably because temporal decompositions for risk functionals generally depend on the history of the system's behavior. Our primary theoretical contribution is to derive under-approximations to risk-sensitive safe sets, which are computationally tractable. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution using numerical examples of a thermostatically controlled load system and a stormwater system.
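The quantity being bounded, the CVaR of a maximum cost along a trajectory, can be illustrated by brute-force Monte Carlo. This is only a sketch under strong assumptions and is not the paper's under-approximation method. The scalar dynamics `x_next = a * x + noise`, the constraint set `[-1, 1]`, the threshold `r`, and all function names are hypothetical. The system is simulated without control, so no minimization over policies is performed.

```python
import math
import random


def empirical_cvar(samples, alpha):
    """Average of the ceil(alpha * n) largest samples (costs: larger is worse)."""
    ordered = sorted(samples, reverse=True)
    k = math.ceil(alpha * len(ordered))
    return sum(ordered[:k]) / k


def max_violation(x0, horizon, a, noise_std, rng):
    """Largest signed distance |x_t| - 1 to the assumed constraint set [-1, 1]."""
    x = x0
    worst = abs(x) - 1.0
    for _ in range(horizon):
        x = a * x + rng.gauss(0.0, noise_std)
        worst = max(worst, abs(x) - 1.0)
    return worst


def cvar_of_max_cost(x0, alpha, horizon=10, a=0.5, noise_std=0.3,
                     n_paths=2000, seed=0):
    """Monte Carlo estimate of CVaR_alpha of the maximum violation from x0."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    costs = [max_violation(x0, horizon, a, noise_std, rng)
             for _ in range(n_paths)]
    return empirical_cvar(costs, alpha)


def in_safe_set(x0, alpha, r=0.0, **kwargs):
    """x0 belongs to the (estimated) r-sublevel set if the CVaR is at most r."""
    return cvar_of_max_cost(x0, alpha, **kwargs) <= r
```

Sweeping `x0` over a grid and calling `in_safe_set(x0, alpha)` gives a rough picture of how the estimated safe set shrinks as `alpha` decreases.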
This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle; thus, our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile, where it matters most for safety.
We study a risk-averse optimal control problem with a finite-horizon Borel model, where the cost is assessed via exponential utility. The setting permits non-linear dynamics, non-quadratic costs, and continuous spaces but is less general than the problem of optimizing an expected utility. Our contribution is to show the existence of an optimal risk-averse controller through the use of measure-theoretic first principles.
We consider the stochastic shortest path planning problem in MDPs, i.e., the problem of designing policies that ensure reaching a goal state from a given initial state with minimum accrued cost. In order to account for rare but important realizations of the system, we consider a nested dynamic coherent risk total cost functional rather than the conventional risk-neutral total expected cost. Under some assumptions, we show that optimal, stationary, Markovian policies exist and can be found via a special Bellman equation. We propose a computational technique based on difference convex programs (DCPs) to find the associated value functions and therefore the risk-averse policies. A rover navigation MDP is used to illustrate the proposed methodology with conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
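To make the idea of a risk-averse Bellman equation concrete, the sketch below runs plain value iteration on a hypothetical two-state shortest-path MDP, with the expectation over successor values replaced by a one-step CVaR. It deliberately swaps in value iteration for the DCP-based technique named above, and the recursion `V(x) = min_u [c(x, u) + CVaR_alpha(V(x'))]` is one common form for nested coherent risk, assumed here rather than taken from the paper. The states, actions, costs, and probabilities are invented for illustration.

```python
def discrete_cvar(values, probs, alpha):
    """CVaR_alpha of a discrete cost: mean over the worst alpha probability mass."""
    remaining = alpha
    total = 0.0
    for value, prob in sorted(zip(values, probs), reverse=True):
        mass = min(prob, remaining)  # clip the last atom to the alpha tail
        total += mass * value
        remaining -= mass
        if remaining <= 0.0:
            break
    return total / alpha


GOAL = "goal"
# Hypothetical model for the single non-goal state "start":
# action -> (stage cost, {successor state: probability}).
MODEL = {
    "safe": (2.0, {GOAL: 1.0}),
    "risky": (1.0, {GOAL: 0.8, "start": 0.2}),
}


def risk_averse_value_iteration(alpha, iterations=200):
    """Iterate V(start) = min_u [cost(u) + CVaR_alpha(V(next state))]."""
    value = {"start": 0.0, GOAL: 0.0}  # the goal is absorbing and cost-free
    q = {}
    for _ in range(iterations):
        q = {}
        for action, (cost, transition) in MODEL.items():
            successors = list(transition)
            vals = [value[s] for s in successors]
            probs = [transition[s] for s in successors]
            q[action] = cost + discrete_cvar(vals, probs, alpha)
        value["start"] = min(q.values())
    return value["start"], min(q, key=q.get)
```

In this toy model, `alpha = 1.0` recovers the risk-neutral recursion and favors the cheaper but unreliable action, while a small `alpha` such as `0.2` treats the 20% chance of failing to reach the goal as the whole tail and switches to the reliable action.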
We propose a new risk-constrained reformulation of the standard Linear Quadratic Regulator (LQR) problem. Our framework is motivated by the fact that the classical (risk-neutral) LQR controller, although optimal in expectation, might be ineffective under relatively infrequent, yet statistically significant (risky) events. To effectively trade between average and extreme event performance, we introduce a new risk constraint, which explicitly restricts the total expected predictive variance of the state penalty by a user-prescribed level. We show that, under rather minimal conditions on the process noise (i.e., finite fourth-order moments), the optimal risk-aware controller can be evaluated explicitly and in closed form. In fact, it is affine relative to the state, and is always internally stable regardless of parameter tuning. Our new risk-aware controller: i) pushes the state away from directions where the noise exhibits heavy tails, by exploiting the third-order moment (skewness) of the noise; ii) inflates the state penalty in riskier directions, where both the noise covariance and the state penalty are simultaneously large. The properties of the proposed risk-aware LQR framework are also illustrated via indicative numerical examples.