In this paper, we study a stochastic recursive optimal control problem in which the value functional is defined by the solution of a backward stochastic differential equation (BSDE) under $\tilde{G}$-expectation. Under standard assumptions, we establish the comparison theorem for this kind of BSDE and give a novel and simple method to obtain the dynamic programming principle. Finally, we prove that the value function is the unique viscosity solution of a type of fully nonlinear HJB equation.
A tensor decomposition approach for the solution of high-dimensional, fully nonlinear Hamilton-Jacobi-Bellman equations arising in optimal feedback control of nonlinear dynamics is presented. The method combines a tensor train approximation for the value function with a Newton-like iterative method for the solution of the resulting nonlinear system. The tensor approximation leads to a polynomial scaling with respect to the dimension, partially circumventing the curse of dimensionality. A convergence analysis for the linear-quadratic case is presented. For nonlinear dynamics, the effectiveness of the high-dimensional control synthesis method is assessed in the optimal feedback stabilization of the Allen-Cahn and Fokker-Planck equations with a hundred variables.
Computing optimal feedback controls for nonlinear systems generally requires solving Hamilton-Jacobi-Bellman (HJB) equations, which are notoriously difficult when the state dimension is large. Existing strategies for high-dimensional problems often rely on specific, restrictive problem structures, or are valid only locally around some nominal trajectory. In this paper, we propose a data-driven method to approximate semi-global solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real time. To accomplish this, we model solutions to HJB equations with neural networks (NNs) trained on data generated without discretizing the state space. Training is made more effective and data-efficient by leveraging the known physics of the problem and using the partially trained NN to aid in adaptive data generation. We demonstrate the effectiveness of our method by learning solutions to HJB equations corresponding to the attitude control of a six-dimensional nonlinear rigid body, and nonlinear systems of dimension up to 30 arising from the stabilization of a Burgers-type partial differential equation. The trained NNs are then used for real-time feedback control of these systems.
Policy iteration is a widely used technique for solving the Hamilton-Jacobi-Bellman (HJB) equation, which arises in nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case. Here we analyze the case with control constraints, both for the HJB equations that arise in deterministic control and for those that arise in stochastic control. The linear equations in each iteration step are solved by an implicit upwind scheme. Numerical examples solve the HJB equation with control constraints, and comparisons with the unconstrained case are shown.
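The alternation between policy evaluation and policy improvement mentioned in the abstract above can be illustrated on a much simpler problem than the one the paper treats. The following is a minimal sketch, assuming a scalar, unconstrained linear-quadratic problem with dynamics dx/dt = a x + b u and cost equal to the integral of q x^2 + r u^2, for which the value function has the form V(x) = p x^2. It does not implement the control constraints or the implicit upwind scheme of the paper; the function names and parameters are hypothetical and chosen only for illustration.

```python
import math


def policy_iteration_scalar_lq(a, b, q, r, k0, tol=1e-12, max_iter=100):
    """Policy iteration for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    The policy is the linear feedback u = -k*x and the value function is
    V(x) = p*x^2. Illustrative assumption: scalar, unconstrained problem.
    """
    if b * k0 <= a:
        raise ValueError("initial gain must be stabilizing: need b*k0 > a")
    k = k0
    history = []
    for _ in range(max_iter):
        # Policy evaluation: for fixed k the HJB equation becomes linear in p:
        #   2*(a - b*k)*p + q + r*k^2 = 0
        p = (q + r * k * k) / (2.0 * (b * k - a))
        history.append(p)
        # Policy improvement: minimizing the Hamiltonian gives u = -(b*p/r)*x
        k_new = b * p / r
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    return p, k, history


def riccati_root(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - b^2*p^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k, history = policy_iteration_scalar_lq(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print("value coefficient p:", p)
    print("closed-form Riccati root:", riccati_root(1.0, 1.0, 1.0, 1.0))
    print("iterates of p:", history)
```

In this linear-quadratic setting each evaluation step is a single linear equation, and the iterates of p decrease toward the positive root of the Riccati equation when the initial gain is stabilizing. In the general nonlinear or constrained setting the evaluation step is a linear PDE that has to be discretized, which is where a scheme such as the one named in the abstract would enter.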
We prove existence and uniqueness of Crandall-Lions viscosity solutions of Hamilton-Jacobi-Bellman equations in the space of continuous paths, associated with the optimal control of path-dependent SDEs. This appears to be the first uniqueness result in such a context. More precisely, similarly to the seminal paper of P.L. Lions, the proof of our core result, that is, the comparison theorem, is based on the fact that the value function is bigger than any viscosity subsolution and smaller than any viscosity supersolution. Such a result, coupled with the proof that the value function is a viscosity solution (based on the dynamic programming principle, which we prove), implies that the value function is the unique viscosity solution to the Hamilton-Jacobi-Bellman equation. The proof of the comparison theorem in P.L. Lions' paper relies on regularity results which are missing in the present infinite-dimensional context, as well as on the local compactness of the finite-dimensional underlying space. We overcome these non-trivial technical difficulties by introducing a suitable approximating procedure and a smooth gauge-type function, which allows us to generate maxima and minima through an appropriate version of the Borwein-Preiss generalization of Ekeland's variational principle on the space of continuous paths.
A novel method for computing reachable sets is proposed in this paper. In the proposed method, a Hamilton-Jacobi-Bellman equation with a running cost function is numerically solved, and the reachable sets of different time horizons are characterized by a family of non-zero level sets of the solution of the Hamilton-Jacobi-Bellman equation. In addition to the classical reachable set, by setting different running cost functions and terminal conditions of the Hamilton-Jacobi-Bellman equation, the proposed method makes it possible to compute more general reachable sets, which are referred to as cost-limited reachable sets. In order to overcome the difficulty of solving the Hamilton-Jacobi-Bellman equation caused by the discontinuity of the solution, a method based on recursion and grid interpolation is employed. At the end of this paper, examples are given to illustrate the validity and generality of the proposed method.
Mingshang Hu, Shaolin Ji, Xiaojuan Li. (2021). "Dynamic programming principle and Hamilton-Jacobi-Bellman equation under nonlinear expectation".