In this paper, we aim to solve high-dimensional stochastic optimal control problems from the viewpoint of the stochastic maximum principle via deep learning. By introducing the extended Hamiltonian system, which is essentially an FBSDE with a maximum condition, we reformulate the original control problem as a new one. Three algorithms are proposed to solve the new control problem. Numerical results for different examples demonstrate the effectiveness of the proposed algorithms, especially in high-dimensional cases. An important application of this method is the computation of sublinear expectations, which correspond to a class of fully nonlinear PDEs.
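For orientation, the extended Hamiltonian system mentioned above can be written, under one common sign convention (generic coefficients $b$, $\sigma$, running cost $f$, and terminal cost $g$ are assumed here; the paper's setting may differ in details), as
$$dX_t = b(t,X_t,u_t)\,dt + \sigma(t,X_t,u_t)\,dW_t,\qquad X_0 = x_0,$$
$$dY_t = -H_x(t,X_t,u_t,Y_t,Z_t)\,dt + Z_t\,dW_t,\qquad Y_T = -g_x(X_T),$$
$$H(t,X_t,u_t,Y_t,Z_t) = \max_{v\in U} H(t,X_t,v,Y_t,Z_t),$$
where $H(t,x,v,y,z) = \langle b(t,x,v), y\rangle + \mathrm{tr}\big(\sigma(t,x,v)^{\top} z\big) - f(t,x,v)$; solving this coupled FBSDE together with the maximum condition is what the three proposed algorithms target.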
This paper applies a reinforcement learning (RL) method to solve infinite-horizon continuous-time stochastic linear quadratic problems, where the drift and diffusion terms in the dynamics may depend on both the state and the control. Based on Bellman's dynamic programming principle, an online RL algorithm is presented that attains the optimal control with only partial system information. The algorithm computes the optimal control directly, rather than estimating the system coefficients and solving the related Riccati equation, and it requires only local trajectory information, which greatly simplifies the computation. Two numerical examples are carried out to illustrate our theoretical findings.
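As a point of reference, a typical problem of this type (stated in standard LQ notation that we assume here) is
$$dX_t = (A X_t + B u_t)\,dt + (C X_t + D u_t)\,dW_t,\qquad \min_{u}\ \mathbb{E}\int_0^{\infty}\big(X_t^{\top} Q X_t + u_t^{\top} R u_t\big)\,dt,$$
whose optimal control is a linear state feedback $u_t^{*} = -K X_t$ with gain $K$ determined by a stochastic algebraic Riccati equation; the RL algorithm learns the feedback directly from trajectory data, bypassing both the identification of $(A,B,C,D)$ and the Riccati equation.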
In this paper, the optimal control problem for neutral stochastic functional differential equations (NSFDEs) is discussed. A class of so-called neutral backward stochastic functional equations of Volterra type (VNBSFEs) is introduced as the adjoint equation, and the existence and uniqueness of solutions to the VNBSFE are established. A Pontryagin maximum principle is then derived for the controlled NSFDE with a Lagrange-type cost functional.
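For context, a controlled NSFDE is commonly written (in a generic form assumed here, with $X_t$ denoting the path segment $\{X(t+\theta):\theta\in[-\tau,0]\}$) as
$$d\big[X(t) - G(t, X_t)\big] = b(t, X_t, u(t))\,dt + \sigma(t, X_t, u(t))\,dW(t),$$
with a Lagrange-type cost functional $J(u) = \mathbb{E}\int_0^T f(t, X_t, u(t))\,dt$; the VNBSFE then plays the role of the adjoint equation in the maximum principle.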
In this paper, we study a partially observed progressive optimal control problem for forward-backward stochastic differential equations with random jumps, where the control domain is not necessarily convex and the control variable enters all the coefficients. In our model, the observation equation is driven not only by a Brownian motion but also by a Poisson random measure, and its noises may be correlated with those of the state equation. As preparation, we first derive the existence and uniqueness of solutions to the fully coupled forward-backward stochastic system with random jumps in $L^2$-space and to the decoupled forward-backward stochastic system with random jumps in $L^\beta\ (\beta>2)$-space, respectively; we then obtain the $L^\beta\ (\beta\geq 2)$-estimate of solutions to the fully coupled system and the nonlinear filtering equation of the partially observed stochastic system with random jumps. We then derive the partially observed global maximum principle with random jumps via a new hierarchical method. To show its applications, a partially observed linear quadratic progressive optimal control problem with random jumps is investigated via the maximum principle and stochastic filtering. A state-estimate feedback representation of the optimal control is given in a more explicit form by introducing some ordinary differential equations.
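For illustration, with generic coefficients assumed here, an observation equation of the kind described takes the form
$$dY_t = h(t, X_t)\,dt + dW_t^{1} + \int_{E} g(t, X_{t-}, e)\,\tilde{N}(dt, de),$$
driven by a Brownian motion $W^{1}$ and a compensated Poisson random measure $\tilde{N}$, both of which may be correlated with the noises of the state equation.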
In this paper, we investigate sparse optimal control of continuous-time stochastic systems. We adopt the dynamic programming approach and analyze the optimal control via the value function. Due to the non-smoothness of the $L^0$ cost functional, the value function is in general not differentiable in the domain. We therefore characterize the value function as a viscosity solution to the associated Hamilton-Jacobi-Bellman (HJB) equation. Based on this result, we derive a necessary and sufficient condition for $L^0$ optimality, which immediately yields the optimal feedback map. In particular, for control-affine systems, we consider the relationship with the $L^1$ optimal control problem and prove an equivalence theorem.
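To make the source of non-smoothness concrete, a representative $L^0$ cost functional (a generic formulation assumed here) is
$$J(x, u) = \mathbb{E}\left[\int_0^T \big(\ell(X_t) + \lambda \|u_t\|_0\big)\,dt + g(X_T)\right],\qquad \|v\|_0 := \#\{i : v_i \neq 0\},$$
which penalizes the number of active control components at each instant and thus promotes controls that vanish on large portions of the time horizon; since $\|\cdot\|_0$ is discontinuous, the value function cannot be expected to be smooth.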
We present a probabilistic formulation of risk-aware optimal control problems for stochastic differential equations. Risk awareness is captured in our framework by objective functions in which the risk-neutral expectation is replaced by a risk function, a nonlinear functional of random variables that accounts for the controller's risk preferences. We state and prove a risk-aware minimum principle that is a parsimonious generalization of the well-known risk-neutral stochastic Pontryagin minimum principle. As our main results, we give necessary and also sufficient conditions for optimality of control processes taking values in probability measures defined on a given action space. We show that, remarkably, in going from the risk-neutral to the risk-aware case, the minimum principle is modified simply by the introduction of one additional real-valued stochastic process that acts as a risk adjustment factor for given cost rate and terminal cost functions. This adjustment process is given explicitly as the expectation, conditional on the filtration at the given time, of an appropriately defined functional derivative of the risk function evaluated at the random total cost. Our results rely on the Fréchet differentiability of the risk function, and for completeness we prove, under mild assumptions, the existence of Fréchet derivatives of some common risk functions. We give a simple application of the results to a portfolio allocation problem and show that the risk awareness of the objective function gives rise to a risk premium term characterized by the risk adjustment process described above. This suggests uses of our results in, e.g., the pricing of risk modeled by generic risk functions in financial applications.
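Schematically, in notation adopted here for illustration, the risk-aware objective is
$$\inf_{u}\ \rho(\Xi),\qquad \Xi := \int_0^T c(t, X_t, u_t)\,dt + g(X_T),$$
where $\rho$ is the risk function (with $\rho = \mathbb{E}$ recovering the risk-neutral case), and the risk adjustment process entering the minimum principle is the conditional expectation $R_t = \mathbb{E}\big[D\rho(\Xi)\,\big|\,\mathcal{F}_t\big]$ of the Fréchet derivative of $\rho$ evaluated at the random total cost $\Xi$.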