While techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest without an explicit dynamical model, and is an essential approach for reinforcement learning problems. However, it usually leads to a non-convex optimization problem, for which there is little theoretical understanding of its performance. In this paper, we focus on the risk-constrained Linear Quadratic Regulator (LQR) problem with noisy input via the PO approach, which results in a challenging non-convex problem. To this end, we first build on our earlier result that the optimal policy has an affine structure to show that the associated Lagrangian function is locally gradient dominated with respect to the policy, and based on this property we establish strong duality. Then, we design policy gradient primal-dual methods with global convergence guarantees to find an optimal policy-multiplier pair in both model-based and sample-based settings. Finally, we validate our policy gradient primal-dual methods in simulations using samples of system trajectories.
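To make the primal-dual idea concrete, the following is a minimal sketch of a primal-descent, dual-ascent gradient iteration on a toy scalar convex problem. It is not the risk-constrained LQR algorithm of the paper: the objective, the constraint, the step sizes `eta` and `zeta`, and the function name `primal_dual` are all assumptions chosen for illustration. The scalar `x` stands in for the policy and `lam` for the multiplier.

```python
def primal_dual(steps=2000, eta=0.1, zeta=0.1):
    """Toy primal-descent / dual-ascent iteration (illustrative only).

    Hypothetical problem: minimize (x - 2)^2 subject to x - 1 <= 0.
    Lagrangian: L(x, lam) = (x - 2)^2 + lam * (x - 1), with lam >= 0.
    The saddle point is x = 1, lam = 2.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian with respect to the primal variable.
        grad_x = 2.0 * (x - 2.0) + lam
        # Constraint value, which is the gradient with respect to lam.
        constraint = x - 1.0
        # Simultaneous update: descend in x, ascend in lam.
        x_next = x - eta * grad_x
        # Project the multiplier back onto the nonnegative reals.
        lam_next = max(0.0, lam + zeta * constraint)
        x, lam = x_next, lam_next
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = primal_dual()
    print(f"x = {x_opt:.6f}, lam = {lam_opt:.6f}")
```

In this toy case the objective is strongly convex, so small constant step sizes suffice for convergence. The non-convex LQR setting described above instead relies on the gradient dominance property of the Lagrangian.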
In this work, we revisit a classical incremental implementation of the primal-descent dual-ascent gradient method used for the solution of equality constrained optimization problems. We provide a short proof that establishes the linear (exponential)
This paper studies the distributed optimization problem where the objective functions might be nondifferentiable and subject to heterogeneous set constraints. Unlike existing subgradient methods, we focus on the case when the exact subgradients of th
Stochastic gradient methods (SGMs) have been widely used for solving stochastic optimization problems. A majority of existing works assume no constraints or easy-to-project constraints. In this paper, we consider convex stochastic optimization proble
The spectral bundle method proposed by Helmberg and Rendl is well established for solving large scale semidefinite programs (SDP) thanks to its low per iteration computational complexity and strong practical performance. In this paper, we revisit thi
Small-scale Mixed-Integer Quadratic Programming (MIQP) problems often arise in embedded control and estimation applications. Driven by the need for algorithmic simplicity to target computing platforms with limited memory and computing resources, this