Stochastic MPECs have found increasing relevance for modeling a broad range of settings in engineering and statistics. Yet, there seem to be no efficient first/zeroth-order schemes equipped with non-asymptotic rate guarantees for resolving even deterministic variants of such problems. We consider MPECs where the parametrized lower-level equilibrium problem is given by a deterministic/stochastic VI problem whose mapping is strongly monotone, uniformly in upper-level decisions. We develop a zeroth-order implicit algorithmic framework by leveraging a locally randomized spherical smoothing scheme. We make three sets of contributions: (i) Convex settings. When the implicit problem is convex and the lower-level decision is obtainable by inexactly solving a strongly monotone stochastic VI to compute an $\epsilon$-solution, we derive iteration complexity guarantees of $\mathcal{O}\left(\tfrac{L_0^2 n^2}{\epsilon^2}\right)$ (upper-level) and $\mathcal{O}\left(\tfrac{L_0^2 n^2}{\epsilon^2} \ln\left(\tfrac{L_0 n}{\epsilon}\right)\right)$ (lower-level); (ii) Exact oracles and accelerated schemes. When the lower-level problem can be resolved exactly, by employing accelerated schemes, the complexity bounds improve to $\mathcal{O}\left(\tfrac{1}{\epsilon}\right)$ and $\mathcal{O}\left(\tfrac{1}{\epsilon^{2+\delta}}\right)$, respectively. Notably, this guarantee extends to stochastic MPECs with equilibrium constraints imposed in an almost-sure sense; (iii) Nonconvex regimes. When the implicit problem is not necessarily convex and the lower-level problem is inexactly resolved via a stochastic approximation framework, computing an $\epsilon$-stationary point admits complexity bounds of $\mathcal{O}\left(\tfrac{L_0^2 n^2}{\epsilon}\right)$ (upper-level) and $\mathcal{O}\left(\tfrac{L_0^6 n^6}{\epsilon^3}\right)$ (lower-level). We also provide numerical results validating the theoretical findings in this work.
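The zeroth-order machinery above rests on a spherical smoothing gradient estimator that requires only function evaluations. The following is a minimal sketch of the generic estimator, not the paper's implicit framework: the test objective, the central-difference variant, and all step-size choices are illustrative assumptions.

```python
import numpy as np

def spherical_gradient_estimate(f, x, eta, rng):
    """Zeroth-order estimate of the gradient of the smoothed surrogate
    f_eta(x) = E[f(x + eta*u)], with u uniform on the unit sphere, via a
    central-difference variant of the identity
    grad f_eta(x) = (n/eta) * E[f(x + eta*u) * u]."""
    n = x.size
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    return (n / (2.0 * eta)) * (f(x + eta * u) - f(x - eta * u)) * u

def zo_gradient_descent(f, x0, eta=1e-3, step=0.1, iters=2000, seed=0):
    """Plain gradient descent driven by the zeroth-order estimator."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x -= step * spherical_gradient_estimate(f, x, eta, rng)
    return x
```

On a smooth convex test function, the smoothed surrogate's gradient closely tracks the true gradient for small `eta`, so this scheme behaves like a noisy gradient descent.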
This paper considers a class of constrained convex stochastic composite optimization problems whose objective function is the sum of a differentiable convex component and a nonsmooth convex component. The nonsmooth component has an explicit max structure whose proximal mapping may not be easy to compute. To solve these problems, we propose a mini-batch stochastic Nesterov smoothing (MSNS) method. Convergence and the optimal iteration complexity of the method are established. Numerical results illustrate the efficiency of the proposed MSNS method on a support vector machine (SVM) model.
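To make the max-structured smoothing concrete, here is a minimal sketch of Nesterov smoothing applied to the hinge loss of an SVM, which is the canonical instance: writing max(0, z) = max_{0<=u<=1} u*z and subtracting the prox term (mu/2)u^2 yields a smooth surrogate with a closed-form gradient. This is only the generic smoothing template, not the authors' MSNS method; the step sizes, regularizer, and data below are illustrative assumptions.

```python
import numpy as np

def smoothed_hinge(z, mu):
    """Nesterov smoothing of max(0, z): f_mu(z) = max_{0<=u<=1} u*z - (mu/2)*u^2.
    Piecewise: 0 for z <= 0, z^2/(2*mu) for 0 < z < mu, z - mu/2 for z >= mu."""
    return np.where(z <= 0, 0.0, np.where(z >= mu, z - mu / 2.0, z ** 2 / (2.0 * mu)))

def smoothed_hinge_grad(z, mu):
    # The inner maximizer u*(z) = clip(z/mu, 0, 1) is also f_mu'(z).
    return np.clip(z / mu, 0.0, 1.0)

def msns_step(w, Xb, yb, mu, lam, lr):
    """One (mini-)batch gradient step on lam/2*||w||^2 + mean_i f_mu(1 - y_i x_i.w)."""
    z = 1.0 - yb * (Xb @ w)
    g = smoothed_hinge_grad(z, mu)
    grad = lam * w - Xb.T @ (g * yb) / len(yb)
    return w - lr * grad
```

Passing a random subsample `(Xb, yb)` at each call gives the mini-batch stochastic variant; the smoothing parameter `mu` trades approximation accuracy against the Lipschitz constant of the gradient, as in Nesterov's smoothing framework.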
It has been widely recognized that the 0/1 loss function is one of the most natural choices for modelling classification errors, and it has a wide range of applications including support vector machines and 1-bit compressed sensing. Due to the combinatorial nature of the 0/1 loss function, methods based on convex relaxations or smoothing approximations have dominated the existing research and are often able to provide approximate solutions of good quality. However, those methods do not optimize the 0/1 loss function directly, and hence no optimality has been established for the original problem. This paper aims to study the optimality conditions of 0/1-loss minimization and, for the first time, to develop a Newton's method that directly optimizes the 0/1 loss function with local quadratic convergence under reasonable conditions. Extensive numerical experiments demonstrate its superior performance, as one would expect from Newton-type methods.
We examine popular gradient-based algorithms for nonlinear control in light of the modern complexity analysis of first-order optimization algorithms. The examination reveals that the complexity bounds can be clearly stated in terms of calls to a computational oracle related to dynamic programming and implementable by gradient back-propagation using machine learning software libraries such as PyTorch or TensorFlow. Finally, we propose a regularized Gauss-Newton algorithm enjoying worst-case complexity bounds and improved convergence behavior in practice. An accompanying software library based on PyTorch is publicly available.
Convex composition optimization is an emerging topic that covers a wide range of applications arising from stochastic optimal control, reinforcement learning and multi-stage stochastic programming. Existing algorithms suffer from unsatisfactory sample complexity and practical issues since they ignore the convexity structure in the algorithmic design. In this paper, we develop a new stochastic compositional variance-reduced gradient algorithm with the sample complexity of $O((m+n)\log(1/\epsilon)+1/\epsilon^3)$ where $m+n$ is the total number of samples. Our algorithm is near-optimal as the dependence on $m+n$ is optimal up to a logarithmic factor. Experimental results on real-world datasets demonstrate the effectiveness and efficiency of the new algorithm.
This paper considers the problem of minimizing a convex expectation function over a closed convex set, coupled with a set of convex expectation inequality constraints. We present a new stochastic approximation type algorithm, the stochastic approximation proximal method of multipliers (PMMSopt), and analyze its regrets for solving this class of convex stochastic optimization problems. Under mild conditions, we show that the algorithm exhibits an ${\rm O}(T^{-1/2})$ rate of convergence, in terms of both the optimality gap and the constraint violation, when the objective and constraint functions are generally convex and the algorithm parameters are properly chosen, where $T$ denotes the number of iterations. Moreover, we show that, with probability at least $1-e^{-T^{1/4}}$, the algorithm incurs no more than ${\rm O}(T^{-1/4})$ objective regret and no more than ${\rm O}(T^{-1/8})$ constraint violation regret. To the best of our knowledge, this is the first time that such a proximal method for solving expectation-constrained stochastic optimization has been presented in the literature.
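For intuition on how stochastic approximation handles expectation constraints, the following is a minimal sketch of the generic projected primal-dual template that such methods build on: a Lagrangian-type primal step followed by a nonnegative multiplier update, with diminishing step sizes and primal averaging. This is not the authors' PMMSopt algorithm (in particular, it omits the proximal regularization of the multiplier step); the problem instance, projection set, and step sizes are illustrative assumptions.

```python
import numpy as np

def primal_dual_sa(grad_f, grad_g, g, x0, T=20000, step=0.5, seed=0):
    """Projected stochastic approximation for min E[f(x)] s.t. E[g(x)] <= 0.
    grad_f, grad_g, g return noisy samples of the objective gradient, the
    constraint gradient, and the constraint value, respectively."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    lam = 0.0                                   # nonnegative multiplier
    avg = np.zeros_like(x)
    for t in range(1, T + 1):
        eta = step / np.sqrt(t)                 # O(t^{-1/2}) step size
        x = x - eta * (grad_f(x, rng) + lam * grad_g(x, rng))  # primal step
        x = np.clip(x, -10.0, 10.0)             # projection onto a simple box
        lam = max(0.0, lam + eta * float(g(x, rng)))           # dual update
        avg += x
    return avg / T                              # averaged primal iterate
```

On a toy instance such as minimizing $E[(x-3)^2]$ subject to $E[x-1] \le 0$ with noisy samples, the averaged iterate settles near the constrained solution $x^\ast = 1$ while the multiplier hovers near its KKT value.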