We consider a two-stage stochastic optimization problem in which a long-term optimization variable is coupled with a set of short-term optimization variables in both the objective and the constraint functions. Although two-stage stochastic optimization plays a critical role in various engineering and scientific applications, efficient algorithms are still lacking, especially when the long-term and short-term variables are coupled in the constraints. To overcome the challenge caused by tightly coupled stochastic constraints, we first establish a two-stage primal-dual decomposition (PDD) method that decomposes the two-stage problem into a long-term problem and a family of short-term subproblems. We then propose a PDD-based stochastic successive convex approximation (PDD-SSCA) algorithmic framework to find KKT solutions of two-stage stochastic optimization problems. At each iteration, PDD-SSCA first runs a short-term sub-algorithm to find stationary points of the short-term subproblems associated with a mini-batch of state samples. It then constructs a convex surrogate for the long-term problem based on deep unrolling of the short-term sub-algorithm and backpropagation. Finally, the convex surrogate problem is solved to generate the next iterate. We establish the almost sure convergence of PDD-SSCA and customize the algorithmic framework to two important application problems. Simulations show that PDD-SSCA achieves superior performance over existing solutions.
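To make the iteration structure concrete, here is a minimal schematic sketch of a PDD-SSCA-style update loop, not the authors' implementation: `sample_states`, `solve_short_term`, `long_term_grad` (e.g., obtained by unrolling the short-term sub-algorithm and backpropagating), and `feasible_project` are hypothetical placeholders, and the quadratic surrogate is one common SSCA choice.

```python
import numpy as np

def pdd_ssca(x0, sample_states, solve_short_term, long_term_grad,
             feasible_project, T=200, tau=1.0):
    """Schematic PDD-SSCA-style loop (illustrative sketch, not the paper's code)."""
    x = x0
    f_bar = np.zeros_like(x)           # running estimate of the surrogate gradient
    for t in range(T):
        rho = 1.0 / (t + 1) ** 0.6     # diminishing averaging step size
        gamma = 1.0 / (t + 1) ** 0.9   # diminishing update step size
        states = sample_states()       # mini-batch of state samples
        # Short-term stage: stationary points of the per-state subproblems
        ys = [solve_short_term(x, s) for s in states]
        # Long-term stage: stochastic gradient via unrolling/backprop (placeholder)
        g = np.mean([long_term_grad(x, s, y) for s, y in zip(states, ys)], axis=0)
        f_bar = (1 - rho) * f_bar + rho * g
        # Minimize the quadratic surrogate f_bar^T(z - x) + (tau/2)||z - x||^2;
        # for this surrogate the minimizer is the projection of a gradient step.
        x_hat = feasible_project(x - f_bar / tau)
        x = (1 - gamma) * x + gamma * x_hat
    return x
```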
We consider a generic empirical composition optimization problem in which empirical averages appear both outside and inside nonlinear loss functions. Such problems arise in various machine learning applications and cannot be solved directly by standard methods such as stochastic gradient descent. We take a novel approach, reformulating the original minimization objective into an equivalent min-max objective that brings all the empirical averages originally inside the nonlinear loss functions to the outside. We exploit the rich structure of the reformulated problem and develop a stochastic primal-dual algorithm, SVRPDA-I, to solve it efficiently. We carry out an extensive theoretical analysis of the proposed algorithm, establishing its convergence rate, computational complexity, and storage complexity. In particular, the algorithm is shown to converge at a linear rate when the problem is strongly convex. We also develop an approximate version of the algorithm, SVRPDA-II, which further reduces the memory requirement. Finally, we evaluate the proposed algorithms on several real-world benchmarks; the experimental results show that they significantly outperform existing techniques.
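As a concrete instance of such a reformulation (in my notation, not necessarily the paper's exact formulation), suppose each convex loss $f_i$ is expressed through its Fenchel conjugate, $f_i(u) = \max_{y_i}\{\langle u, y_i\rangle - f_i^{*}(y_i)\}$. Then

$$
\min_{x}\ \frac{1}{n}\sum_{i=1}^{n} f_i\Big(\frac{1}{m}\sum_{j=1}^{m} g_j(x)\Big)
\;=\;
\min_{x}\ \max_{y_1,\dots,y_n}\ \frac{1}{n}\sum_{i=1}^{n}\Big[\Big\langle \tfrac{1}{m}\sum_{j=1}^{m} g_j(x),\, y_i\Big\rangle - f_i^{*}(y_i)\Big],
$$

and the inner empirical average now appears linearly, so it can be sampled without bias; this is what enables a stochastic primal-dual method with variance reduction.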
In this paper, we consider multi-stage stochastic optimization problems with convex objectives and conic constraints at each stage. We present a new stochastic first-order method, the dynamic stochastic approximation (DSA) algorithm, for solving these problems. We show that DSA achieves an optimal $\mathcal{O}(1/\epsilon^4)$ rate of convergence, in terms of the total number of required scenarios, when applied to a three-stage stochastic optimization problem. We further show that this rate improves to $\mathcal{O}(1/\epsilon^2)$ when the objective function is strongly convex. We also discuss variants of DSA for solving more general multi-stage stochastic optimization problems with the number of stages $T > 3$. The developed DSA algorithms only need to go through the scenario tree once to compute an $\epsilon$-solution of the multi-stage stochastic optimization problem; as a result, the memory required by DSA grows only linearly with the number of stages. To the best of our knowledge, this is the first time that stochastic approximation type methods have been generalized to multi-stage stochastic optimization with $T \ge 3$.
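The single-pass, nested structure can be conveyed with a rough sketch (my simplification in the spirit of nested stochastic approximation; the actual DSA algorithm uses carefully chosen prox-mappings and step-size policies that this sketch omits). All callables below are hypothetical placeholders.

```python
import numpy as np

def nested_sa_three_stage(x1, sample_xi2, sample_xi3, grad1, grad2, grad3,
                          prox, N1=100, N2=10, N3=5, gamma=0.1):
    """Rough three-stage nested-SA sketch; not the paper's exact DSA procedure."""
    for _ in range(N1):
        xi2 = sample_xi2()                 # stage-2 scenario
        x2 = np.array(x1, copy=True)
        for _ in range(N2):                # inner SA for the stage-2 decision
            xi3 = sample_xi3(xi2)          # stage-3 scenario given stage 2
            x3 = np.array(x2, copy=True)
            for _ in range(N3):            # innermost SA for the stage-3 decision
                x3 = prox(x3 - gamma * grad3(x3, xi3))
            x2 = prox(x2 - gamma * grad2(x2, x3, xi2))
        x1 = prox(x1 - gamma * grad1(x1, x2))   # stage-1 update
    return x1
```

Each scenario is visited once as it is generated, so only the current decisions at each stage need to be kept in memory, which is the source of the linear-in-$T$ memory footprint.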
This paper investigates the problem of regulating, in real time, a linear dynamical system to the solution trajectory of a time-varying constrained convex optimization problem. The proposed feedback controller is based on an adaptation of saddle-flow dynamics, modified to account for projections onto constraint sets and for output feedback from the plant. We derive sufficient conditions on the tunable parameters of the controller (inherently related to the time-scale separation between plant and controller dynamics) that guarantee exponential and input-to-state stability of the closed-loop system. The analysis is tailored to time-varying strongly convex cost functions and polytopic output constraints. The theoretical results are further validated on a ramp-metering control problem in a network of traffic highways.
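A forward-Euler discretization conveys the controller structure; this is an illustrative sketch rather than the paper's exact dynamics, and `plant_step`, `grad_f`, the steady-state sensitivity matrix `H`, and the constraint data `C`, `c` are hypothetical placeholders.

```python
import numpy as np

def saddle_flow_controller(u, lam, plant_step, grad_f, H, C, c,
                           eta=1e-3, T=50_000):
    """Euler-discretized projected saddle-flow output-feedback controller (sketch).

    plant_step(u) -> measured output y; H approximates the plant's steady-state
    input-output sensitivity; the polytopic output constraint is C y <= c.
    """
    for t in range(T):
        y = plant_step(u)                                 # output feedback
        # primal descent on the Lagrangian f(y) + lam^T (C y - c)
        u = u - eta * (H.T @ (grad_f(y, t) + C.T @ lam))
        # projected dual ascent (projection onto the nonnegative orthant)
        lam = np.maximum(0.0, lam + eta * (C @ y - c))
    return u, lam
```

The step size `eta` plays the role of the time-scale separation parameter: the controller must evolve slowly enough relative to the plant for the stability conditions to hold.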
This paper studies the distributed optimization problem in which the objective functions may be nondifferentiable and subject to heterogeneous set constraints. Unlike existing subgradient methods, we focus on the case where the exact subgradients of the local objective functions cannot be accessed by the agents. To solve this problem, we propose a projected primal-dual dynamics that uses only approximate subgradients of the local objective functions. We first prove that the formulated optimization problem can, in general, only be solved up to an approximation error that depends on the accuracy of the available subgradients. We then show that the problem can be solved exactly if the accumulated approximation error is not too large. We also give a novel componentwise normalized variant to improve the transient behavior of the convergent sequence. The effectiveness of our algorithms is verified by a numerical example.
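The following is a loose Euler-discretized sketch of a consensus-based projected primal-dual update with inexact subgradients; it is my construction for illustration, not the authors' exact dynamics, and `approx_subgrad`, `project`, and the mixing matrix `W` are hypothetical placeholders.

```python
import numpy as np

def distributed_pd(x, lam, W, approx_subgrad, project, eta=1e-2, T=5000,
                   normalize=False):
    """Projected primal-dual sketch with approximate subgradients (illustrative).

    x, lam: (n_agents, d) local primal/dual variables; W: doubly stochastic
    mixing matrix of the communication graph.
    """
    n = x.shape[0]
    Lap = np.eye(n) - W                    # Laplacian-like consensus operator
    for _ in range(T):
        g = np.stack([approx_subgrad(i, x[i]) for i in range(n)])
        if normalize:                      # componentwise normalization variant
            g = g / (np.abs(g) + 1e-12)
        lam_new = lam + eta * (Lap @ x)    # dual ascent on consensus violation
        x_new = x - eta * (g + Lap @ x + Lap @ lam)
        x = np.stack([project(i, x_new[i]) for i in range(n)])  # local projections
        lam = lam_new
    return x, lam
```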
We lower bound the complexity of finding $\epsilon$-stationary points (points with gradient norm at most $\epsilon$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $\epsilon^{-4}$ queries to find an $\epsilon$-stationary point. The lower bound is tight and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model, where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of $\epsilon^{-3}$ queries, establishing the optimality of recently proposed variance reduction techniques.
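In standard notation, the two oracle models can be summarized as follows: the oracle returns an estimate $g(x,\xi)$ satisfying unbiasedness and bounded variance,

$$
\mathbb{E}_{\xi}\big[g(x,\xi)\big] = \nabla f(x),
\qquad
\mathbb{E}_{\xi}\big\|g(x,\xi) - \nabla f(x)\big\|^{2} \le \sigma^{2},
$$

while the more restrictive model additionally imposes mean-squared smoothness,

$$
\mathbb{E}_{\xi}\big\|g(x,\xi) - g(y,\xi)\big\|^{2} \le \bar{L}^{2}\,\|x-y\|^{2} \quad \text{for all } x, y,
$$

under which the $\epsilon^{-3}$ lower bound matches the rates achieved by variance-reduced methods.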