ترغب بنشر مسار تعليمي؟ اضغط هنا

Dynamic Programming for General Linear Quadratic Optimal Stochastic Control with Random Coefficients

120   0   0.0 ( 0 )
 نشر من قبل Shanjian Tang
 تاريخ النشر 2014
  مجال البحث
والبحث باللغة English
 تأليف Shanjian Tang




اسأل ChatGPT حول البحث

We are concerned with the linear-quadratic optimal stochastic control problem with random coefficients. Under suitable conditions, we prove that the value field $V(t,x,omega), (t,x,omega)in [0,T]times R^ntimes Omega$, is quadratic in $x$, and has the following form: $V(t,x)=langle K_tx, xrangle$ where $K$ is an essentially bounded nonnegative symmetric matrix-valued adapted processes. Using the dynamic programming principle (DPP), we prove that $K$ is a continuous semi-martingale of the form $$K_t=K_0+int_0^t , dk_s+sum_{i=1}^dint_0^tL_s^i, dW_s^i, quad tin [0,T]$$ with $k$ being a continuous process of bounded variation and $$Eleft[left(int_0^T|L_s|^2, dsright)^pright] <infty, quad forall pge 2; $$ and that $(K, L)$ with $L:=(L^1, cdots, L^d)$ is a solution to the associated backward stochastic Riccati equation (BSRE), whose generator is highly nonlinear in the unknown pair of processes. The uniqueness is also proved via a localized completion of squares in a self-contained manner for a general BSRE. The existence and uniqueness of adapted solution to a general BSRE was initially proposed by the French mathematician J. M. Bismut (1976, 1978). It had been solved by the author (2003) via the stochastic maximum principle with a viewpoint of stochastic flow for the associated stochastic Hamiltonian system. The present paper is its companion, and gives the {it second but more comprehensive} adapted solution to a general BSRE via the DDP. Further extensions to the jump-diffusion control system and to the general nonlinear control system are possible.



قيم البحث

اقرأ أيضاً

95 - Jingrui Sun , Zhen Wu , Jie Xiong 2021
This paper is concerned with a backward stochastic linear-quadratic (LQ, for short) optimal control problem with deterministic coefficients. The weighting matrices are allowed to be indefinite, and cross-product terms in the control and state process es are present in the cost functional. Based on a Hilbert space method, necessary and sufficient conditions are derived for the solvability of the problem, and a general approach for constructing optimal controls is developed. The crucial step in this construction is to establish the solvability of a Riccati-type equation, which is accomplished under a fairly weak condition by investigating the connection with forward stochastic LQ optimal control problems.
100 - Na Li , Xun Li , Jing Peng 2020
This paper applies a reinforcement learning (RL) method to solve infinite horizon continuous-time stochastic linear quadratic problems, where drift and diffusion terms in the dynamics may depend on both the state and control. Based on Bellmans dynami c programming principle, an online RL algorithm is presented to attain the optimal control with just partial system information. This algorithm directly computes the optimal control rather than estimating the system coefficients and solving the related Riccati equation. It just requires local trajectory information, greatly simplifying the calculation processing. Two numerical examples are carried out to shed light on our theoretical findings.
The linear-quadratic regulator (LQR) is an efficient control method for linear and linearized systems. Typically, LQR is implemented in minimal coordinates (also called generalized or joint coordinates). However, other coordinates are possible and re cent research suggests that there may be numerical and control-theoretic advantages when using higher-dimensional non-minimal state parameterizations for dynamical systems. One such parameterization is maximal coordinates, in which each link in a multi-body system is parameterized by its full six degrees of freedom and joints between links are modeled with algebraic constraints. Such constraints can also represent closed kinematic loops or contact with the environment. This paper investigates the difference between minimal- and maximal-coordinate LQR control laws. A case study of applying LQR to a simple pendulum and simulations comparing the basins of attraction and tracking performance of minimal- and maximal-coordinate LQR controllers suggest that maximal-coordinate LQR achieves greater robustness and improved tracking performance compared to minimal-coordinate LQR when applied to nonlinear systems.
105 - Sven Leyffer , Paul Manns 2021
We propose a trust-region method that solves a sequence of linear integer programs to tackle integer optimal control problems regularized with a total variation penalty. The total variation penalty allows us to prove the existence of minimizers of the integer optimal control problem. We introduce a local optimality concept for the problem, which arises from the infinite-dimensional perspective. In the case of a one-dimensional domain of the control function, we prove convergence of the iterates produced by our algorithm to points that satisfy first-order stationarity conditions for local optimality. We demonstrate the theoretical findings on a computational example.
This paper discusses the odds problem, proposed by Bruss in 2000, and its variants. A recurrence relation called a dynamic programming (DP) equation is used to find an optimal stopping policy of the odds problem and its variants. In 2013, Buchbinder, Jain, and Singh proposed a linear programming (LP) formulation for finding an optimal stopping policy of the classical secretary problem, which is a special case of the odds problem. The proposed linear programming problem, which maximizes the probability of a win, differs from the DP equations known for long time periods. This paper shows that an ordinary DP equation is a modification of the dual problem of linear programming including the LP formulation proposed by Buchbinder, Jain, and Singh.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا