We present a new, tractable method for solving and analyzing risk-aware control problems over finite and discounted infinite time horizons, where the dynamics of the controlled process are described by a martingale problem. Assuming general Polish state and action spaces and using generalized (relaxed) controls, we state a risk-aware dynamic optimal control problem of minimizing the risk of costs, as measured by a generic risk function. We then construct an alternative formulation that takes the form of a nonlinear programming problem constrained by the dynamic (i.e., time-dependent) linear Kolmogorov forward equation describing the distribution of the state and accumulated costs. We show that the two formulations are equivalent and that the optimal control process can be taken to be Markov in the controlled process state, running costs, and time. We further prove that, under additional conditions, the optimal value is attained. A numerical example is presented and solved.
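To make the forward-equation idea concrete, the following is a minimal sketch in a much simpler setting than the one treated in the abstract: finite state and action spaces, discrete time, and integer running costs. It propagates the joint distribution of the state and the accumulated cost under a relaxed control that is Markov in time, state, and accumulated cost, then evaluates a risk function on the terminal cost. The function names, the array layout, and the choice of conditional value-at-risk (CVaR) as the risk function are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def forward_cost_distribution(P, cost, policy, mu0, horizon):
    """Propagate the joint law of (state, accumulated cost) forward in time.

    Hypothetical toy setting, not the paper's formulation:
      P[a, x, y]         probability of moving x -> y under action a
      cost[x, a]         non-negative integer running cost
      policy[t, x, c, a] probability of action a at time t, in state x,
                         with accumulated cost c (a relaxed Markov control)
      mu0[x]             initial state distribution
    Returns the probability mass function of the terminal accumulated cost.
    """
    n_actions, n_states, _ = P.shape
    max_cost = int(cost.max()) * horizon
    mu = np.zeros((n_states, max_cost + 1))
    mu[:, 0] = mu0  # no cost has accumulated at time 0
    for t in range(horizon):
        nxt = np.zeros_like(mu)
        for x in range(n_states):
            for c in range(max_cost + 1):
                if mu[x, c] == 0.0:
                    continue
                for a in range(n_actions):
                    mass = mu[x, c] * policy[t, x, c, a]
                    # The update is linear in mu: a discrete forward equation.
                    nxt[:, c + cost[x, a]] += mass * P[a, x, :]
        mu = nxt
    return mu.sum(axis=0)


def cvar(pmf, alpha):
    """CVaR at level alpha of an integer cost with the given pmf.

    Uses the Rockafellar-Uryasev form min_s { s + E[(Z - s)^+] / (1 - alpha) };
    for a discrete cost the minimum is attained at one of its possible values.
    """
    values = np.arange(len(pmf))
    return min(
        s + np.maximum(values - s, 0) @ pmf / (1.0 - alpha) for s in values
    )


# Small demonstration: two states that never change, two actions with
# costs 1 and 2, and a control that picks each action with probability 1/2.
P_demo = np.stack([np.eye(2), np.eye(2)])
cost_demo = np.array([[1, 2], [1, 2]])
horizon_demo = 2
policy_demo = np.full((horizon_demo, 2, 2 * horizon_demo + 1, 2), 0.5)
pmf_demo = forward_cost_distribution(
    P_demo, cost_demo, policy_demo, np.array([1.0, 0.0]), horizon_demo
)
risk_demo = cvar(pmf_demo, alpha=0.75)
```

In this toy version the risk is a nonlinear function of the terminal distribution, while the distribution itself depends on the control only through the linear forward update, which mirrors the structure of the nonlinear program described above.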
Despite the simplicity and intuitive interpretation of Minimum Mean Squared Error (MMSE) estimators, their effectiveness in certain scenarios is questionable. Indeed, minimizing squared errors on average does not provide any form of stability, as the
We present Free-MESSAGEp, the first zeroth-order algorithm for convex mean-semideviation-based risk-aware learning, which is also the first three-level zeroth-order compositional stochastic optimization algorithm, whatsoever. Using a non-trivial exte
We prove a duality relation and an integration by parts formula for fractional operators with a general analytical kernel. Based on these basic results, we are able to prove a new Gronwall inequality and continuity and differentiability of solutions
We derive equivalent linear and dynamic programs for infinite horizon risk-sensitive control for minimization of the asymptotic growth rate of the cumulative cost.
In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network. Specifically, we assume that each node has access to a summand of a global objective fu