
Minimizing Spectral Risk Measures Applied to Markov Decision Processes

Published by Alexander Glauner
Publication date: 2020
Research field: Finance
Language: English





We study the minimization of a spectral risk measure of the total discounted cost generated by a Markov Decision Process (MDP) over a finite or infinite planning horizon. The MDP is assumed to have Borel state and action spaces, and the cost function may be unbounded above. The optimization problem is split into two minimization problems using an infimum representation for spectral risk measures. We show that the inner minimization problem can be solved as an ordinary MDP on an extended state space and give sufficient conditions under which an optimal policy exists. Regarding the infinite-dimensional outer minimization problem, we prove the existence of a solution and derive an algorithm for its numerical approximation. Our results include the findings in Bäuerle and Ott (2011) in the special case that the risk measure is Expected Shortfall. As an application, we present a dynamic extension of the classical static optimal reinsurance problem, where an insurance company minimizes its cost of capital.
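To make the idea of an infimum representation concrete, the following minimal Python sketch covers only the Expected Shortfall special case mentioned in the abstract. It uses the well-known Rockafellar-Uryasev formula ES_alpha(X) = min_q { q + E[(X - q)^+] / (1 - alpha) } on a finite sample of costs. This is not the paper's algorithm for general spectral risk measures; the function names and the sample data are hypothetical, chosen for illustration.

```python
import math


def expected_shortfall_tail(costs, alpha):
    """Average of the worst (1 - alpha) fraction of the sampled costs.

    Assumption: len(costs) * (1 - alpha) is (numerically) a positive integer,
    so the tail average is exact for the empirical distribution.
    """
    xs = sorted(costs)
    k = max(1, round(len(xs) * (1 - alpha)))
    return sum(xs[-k:]) / k


def expected_shortfall_inf(costs, alpha):
    """Expected Shortfall via the infimum (Rockafellar-Uryasev) representation.

    The objective q + E[(X - q)^+] / (1 - alpha) is convex and piecewise
    linear in q with kinks at the sample points, so for an empirical
    distribution it suffices to search over the samples themselves.
    """
    n = len(costs)

    def objective(q):
        excess = sum(max(x - q, 0.0) for x in costs)
        return q + excess / (n * (1 - alpha))

    return min(objective(q) for q in costs)


if __name__ == "__main__":
    sample_costs = list(range(1, 11))  # hypothetical costs 1, 2, ..., 10
    print(expected_shortfall_tail(sample_costs, 0.8))
    print(expected_shortfall_inf(sample_costs, 0.8))
```

For the hypothetical sample above with alpha = 0.8, both functions agree up to floating-point rounding on the mean of the two largest costs. In the setting of the abstract, the fixed scalar q would play the role of the outer variable for Expected Shortfall, while the expectation for a given q corresponds to the inner problem that is solved as an ordinary MDP.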




Read also

The paper analyzes risk assessment for cash flows in continuous time using the notion of convex risk measures for processes. By combining a decomposition result for optional measures and a dual representation of a convex risk measure for bounded càdlàg processes, we show that this framework provides a systematic approach to both issues of model ambiguity and uncertainty about the time value of money. We also establish a link between risk measures for processes and BSDEs.
We introduce and treat a class of Multi Objective Risk-Sensitive Markov Decision Processes (MORSMDPs), where the optimality criteria are generated by a multivariate utility function applied on a finite set of different running costs. To illustrate our approach, we study the example of a two-armed bandit problem. In the sequel, we show that it is possible to reformulate standard Risk-Sensitive Partially Observable Markov Decision Processes (RSPOMDPs), where risk is modeled by a utility function that is a sum of exponentials, as MORSMDPs that can be solved with the methods described in the first part. This way, we extend the treatment of RSPOMDPs with exponential utility to RSPOMDPs corresponding to a qualitatively bigger family of utility functions.
In this paper, we study general monetary risk measures (without any convexity or weak convexity). A monetary (respectively, positively homogeneous) risk measure can be characterized as the lower envelope of a family of convex (respectively, coherent) risk measures. The proof does not depend on, but easily leads to, the classical representation theorems for convex and coherent risk measures. When law-invariance and SSD (second-order stochastic dominance) consistency are involved, it is not the convexity (respectively, coherence) but the comonotonic convexity (respectively, comonotonic coherence) of risk measures that can be used for this kind of lower envelope characterization in a unified form. The representation of a law-invariant risk measure in terms of VaR is provided.
In this paper we present an algorithm to compute risk averse policies in Markov Decision Processes (MDP) when the total cost criterion is used together with the average value at risk (AVaR) metric. Risk averse policies are needed when large deviations from the expected behavior may have detrimental effects, and conventional MDP algorithms usually ignore this aspect. We provide conditions for the structure of the underlying MDP ensuring that approximations for the exact problem can be derived and solved efficiently. Our findings are novel inasmuch as average value at risk has not previously been considered in association with the total cost criterion. Our method is demonstrated in a rapid deployment scenario, whereby a robot is tasked with the objective of reaching a target location within a temporal deadline where increased speed is associated with increased probability of failure. We demonstrate that the proposed algorithm not only produces a risk averse policy reducing the probability of exceeding the expected temporal deadline, but also provides the statistical distribution of costs, thus offering a valuable analysis tool.
In this paper we propose the notion of continuous-time dynamic spectral risk-measure (DSR). Adopting a Poisson random measure setting, we define this class of dynamic coherent risk-measures in terms of certain backward stochastic differential equations. By establishing a functional limit theorem, we show that DSRs may be considered to be (strongly) time-consistent continuous-time extensions of iterated spectral risk-measures, which are obtained by iterating a given spectral risk-measure (such as Expected Shortfall) along a given time-grid. Specifically, we demonstrate that any DSR arises in the limit of a sequence of such iterated spectral risk-measures driven by lattice-random walks, under suitable scaling and vanishing time- and spatial-mesh sizes. To illustrate its use in financial optimisation problems, we analyse a dynamic portfolio optimisation problem under a DSR.