New First-Order Algorithms for Stochastic Variational Inequalities


Added by Kevin Huang

Publication date 2021


Research language: English

Authors Kevin Huang - Shuzhong Zhang

Optimization and Control


In this paper, we propose two new solution schemes to solve the stochastic strongly monotone variational inequality problems: the stochastic extra-point solution scheme and the stochastic extra-momentum solution scheme. The first one is a general scheme based on updating the iterative sequence and an auxiliary extra-point sequence. In the case of deterministic VI model, this approach includes several state-of-the-art first-order methods as its special cases. The second scheme combines two momentum-based directions: the so-called heavy-ball direction and the optimism direction, where only one projection per iteration is required in its updating process. We show that, if the variance of the stochastic oracle is appropriately controlled, then both schemes can be made to achieve optimal iteration complexity of $\mathcal{O}\left(\kappa\ln\left(\frac{1}{\epsilon}\right)\right)$ to reach an $\epsilon$-solution for a strongly monotone VI problem with condition number $\kappa$. We show that these methods can be readily incorporated in a zeroth-order approach to solve stochastic minimax saddle-point problems, where only noisy and biased samples of the objective can be obtained, with a total sample complexity of $\mathcal{O}\left(\frac{\kappa^2}{\epsilon}\ln\left(\frac{1}{\epsilon}\right)\right)$.
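As a rough illustration of the problem class, the sketch below runs the classical extra-gradient iteration (not the authors' extra-point or extra-momentum scheme) on a small unconstrained strongly monotone VI. The toy operator, the step size 1/(2L), the Gaussian noise model, and all function names are assumptions made for this example only.

```python
import math
import random

def operator(z, mu=1.0):
    """Toy operator F(x, y) = (mu*x + y, mu*y - x) from the saddle function
    (mu/2)x^2 + x*y - (mu/2)y^2. It is mu-strongly monotone with solution (0, 0)."""
    x, y = z
    return (mu * x + y, mu * y - x)

def extragradient(z0, step, iters, mu=1.0, noise_std=0.0, seed=0):
    """Classical extra-gradient with an optional noisy oracle (hypothetical noise model)."""
    rng = random.Random(seed)

    def noisy_operator(z):
        fx, fy = operator(z, mu)
        return (fx + rng.gauss(0.0, noise_std), fy + rng.gauss(0.0, noise_std))

    z = z0
    for _ in range(iters):
        g = noisy_operator(z)
        # Extrapolation step to the auxiliary (extra) point.
        z_half = (z[0] - step * g[0], z[1] - step * g[1])
        g_half = noisy_operator(z_half)
        # Update the main iterate using the operator evaluated at the extra point.
        z = (z[0] - step * g_half[0], z[1] - step * g_half[1])
    return z

mu = 1.0
lipschitz = math.sqrt(mu * mu + 1.0)  # Lipschitz constant of the toy operator
step = 1.0 / (2.0 * lipschitz)

z_exact = extragradient((1.0, 1.0), step, 100, mu)
z_noisy = extragradient((1.0, 1.0), step, 100, mu, noise_std=0.01)
print("noise-free distance to solution:", math.hypot(*z_exact))
print("noisy distance to solution:", math.hypot(*z_noisy))
```

With a noise-free oracle the iterates contract linearly toward the solution, while a fixed noise level typically leaves them in a small neighborhood of it, which is consistent with the abstract's requirement that the oracle variance be controlled.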

rate research

A Unifying Framework of Accelerated First-Order Approach to Strongly Monotone Variational Inequalities

Kevin Huang, Shuzhong Zhang, 2021

In this paper, we propose a unifying framework incorporating several momentum-related search directions for solving strongly monotone variational inequalities. The specific combinations of the search directions in the framework are made to guarantee the optimal iteration complexity bound of $\mathcal{O}\left(\kappa\ln(1/\epsilon)\right)$ to reach an $\epsilon$-solution, where $\kappa$ is the condition number. This framework provides the flexibility for algorithm designers to train -- among different parameter combinations -- the one that best suits the structure of the problem class at hand. The proposed framework includes the following iterative points and directions as its constituents: the extra-gradient, the optimistic gradient descent ascent (OGDA) direction (aka optimism), the heavy-ball direction, and Nesterov's extrapolation points. As a result, all the afore-mentioned methods become the special cases under the general scheme of extra points. We also specialize this approach to strongly convex minimization, and show that a similar extra-point approach achieves the optimal iteration complexity bound of $\mathcal{O}(\sqrt{\kappa}\ln(1/\epsilon))$ for this class of problems.
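One of the constituents named above, the OGDA (optimism) direction, can be sketched in a few lines of Python. This is only the standard OGDA update on an assumed toy operator, not the unified framework itself; the operator, the step size 1/(4L), and the function names are illustrative assumptions.

```python
import math

def operator(z, mu=1.0):
    """Assumed toy operator F(x, y) = (mu*x + y, mu*y - x); mu-strongly monotone, zero at (0, 0)."""
    x, y = z
    return (mu * x + y, mu * y - x)

def ogda(z0, step, iters, mu=1.0):
    """Standard OGDA: z_{k+1} = z_k - step * (2*F(z_k) - F(z_{k-1}))."""
    z = z0
    g_prev = operator(z0, mu)  # makes the first update a plain gradient step
    for _ in range(iters):
        g = operator(z, mu)
        # The optimism correction reuses the previous operator value,
        # so each iteration needs only one new operator evaluation.
        z = (z[0] - step * (2.0 * g[0] - g_prev[0]),
             z[1] - step * (2.0 * g[1] - g_prev[1]))
        g_prev = g
    return z

mu = 1.0
lipschitz = math.sqrt(mu * mu + 1.0)  # Lipschitz constant of the toy operator
z_final = ogda((1.0, 1.0), 1.0 / (4.0 * lipschitz), 200, mu)
print("distance to solution:", math.hypot(*z_final))
```

Unlike extra-gradient, which evaluates the operator twice per iteration, this update stores the previous evaluation instead, which is the usual motivation for the optimism direction.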

Optimization and Control

Zeroth-Order Algorithms for Stochastic Distributed Nonconvex Optimization

Xinlei Yi, Shengjun Zhang, Tao Yang, 2021

In this paper, we consider a stochastic distributed nonconvex optimization problem with the cost function being distributed over $n$ agents having access only to zeroth-order (ZO) information of the cost. This problem has various machine learning applications. As a solution, we propose two distributed ZO algorithms, in which at each iteration each agent samples the local stochastic ZO oracle at two points with an adaptive smoothing parameter. We show that the proposed algorithms achieve the linear speedup convergence rate $\mathcal{O}(\sqrt{p/(nT)})$ for smooth cost functions and $\mathcal{O}(p/(nT))$ convergence rate when the global cost function additionally satisfies the Polyak--Lojasiewicz (P--L) condition, where $p$ and $T$ are the dimension of the decision variable and the total number of iterations, respectively. To the best of our knowledge, this is the first linear speedup result for distributed ZO algorithms, which enables systematic processing performance improvements by adding more agents. We also show that the proposed algorithms converge linearly when considering deterministic centralized optimization problems under the P--L condition. We demonstrate through numerical experiments the efficiency of our algorithms on generating adversarial examples from deep neural networks in comparison with baseline and recently proposed centralized and distributed ZO algorithms.
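The two-point sampling idea can be illustrated with a generic zeroth-order gradient estimator. The Python sketch below is a centralized, single-agent toy with a fixed smoothing parameter, so it is not the distributed algorithm of the paper; the test function, step size, and all names are assumptions for illustration.

```python
import math
import random

def two_point_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function values along a random unit direction."""
    p = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]  # uniform direction on the unit sphere
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * delta)  # finite-difference directional derivative
    # Scaling by the dimension p makes the estimator match the gradient
    # of a smoothed version of f in expectation.
    return [p * slope * ui for ui in u]

def zo_descent(f, x0, step, delta, iters, seed=0):
    """Plain descent driven only by function evaluations (no gradients of f are used)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = two_point_gradient(f, x, delta, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def sphere(x):
    """Assumed test cost: a smooth quadratic that satisfies the P-L condition."""
    return sum(xi * xi for xi in x)

x0 = [1.0, -2.0, 0.5, 3.0, -1.5]
x_final = zo_descent(sphere, x0, step=0.5 / len(x0), delta=1e-3, iters=200)
print("initial cost:", sphere(x0), "final cost:", sphere(x_final))
```

Each iteration costs two function evaluations regardless of the dimension $p$, and the price is a noisier direction, which is why the rates quoted above depend on $p$.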

Optimization and Control

Perturbation techniques for convergence analysis of proximal gradient method and other first-order algorithms via variational analysis

Xiangfeng Wang, Jane Ye, Xiaoming Yuan, 2018

We develop new perturbation techniques for conducting convergence analysis of various first-order algorithms for a class of nonsmooth optimization problems. We consider the iteration scheme of an algorithm to construct a perturbed stationary point set-valued map, and define the perturbing parameter by the difference of two consecutive iterates. Then, we show that the calmness condition of the induced set-valued map, together with a local version of the proper separation of stationary value condition, is a sufficient condition to ensure the linear convergence of the algorithm. The equivalence of the calmness condition to the one for the canonically perturbed stationary point set-valued map is proved, and this equivalence allows us to derive some sufficient conditions for calmness by using some recent developments in variational analysis. These sufficient conditions are different from existing results (especially, those error-bound-based ones) in that they can be easily verified for many concrete application models. Our analysis is focused on the fundamental proximal gradient (PG) method, and it enables us to show that any accumulation point of the sequence generated by the PG method must be a stationary point in terms of the proximal subdifferential, instead of the limiting subdifferential. This result reveals the surprising fact that the solution quality found by the PG method is in general superior. Our analysis also leads to some improvement for the linear convergence results of the PG method in the convex case. The new perturbation technique can be conveniently used to derive linear rate convergence of a number of other first-order methods including the well-known alternating direction method of multipliers and primal-dual hybrid gradient method, under mild assumptions.
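For readers unfamiliar with the PG method analyzed here, the following Python sketch shows its basic forward-backward iteration on an assumed separable toy problem, minimizing $\frac{1}{2}\|x-b\|^2 + \lambda\|x\|_1$, whose solution is known in closed form. The problem data, step size, and function names are illustrative assumptions and are not taken from the paper.

```python
import math

def soft_threshold(v, tau):
    """Proximal map of tau*|.|: shrink v toward zero by tau."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

def proximal_gradient(b, lam, step, iters):
    """PG iteration for 0.5*||x - b||^2 + lam*||x||_1, started from zero."""
    x = [0.0] * len(b)
    for _ in range(iters):
        # Forward step: gradient of the smooth part 0.5*||x - b||^2 is x - b.
        forward = [xi - step * (xi - bi) for xi, bi in zip(x, b)]
        # Backward step: proximal map of the nonsmooth part, scaled by the step size.
        x = [soft_threshold(vi, step * lam) for vi in forward]
    return x

b = [3.0, -0.5, 1.5]
lam = 1.0
x_pg = proximal_gradient(b, lam, step=0.5, iters=60)
x_closed = [soft_threshold(bi, lam) for bi in b]  # known minimizer of this toy problem
print("PG iterate:", x_pg)
print("closed form:", x_closed)
```

On this strongly convex toy problem the error shrinks by a constant factor per iteration, a simple instance of the linear convergence that the paper studies under much weaker conditions.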

Optimization and Control

Zeroth-order Stochastic Compositional Algorithms for Risk-Aware Learning

Dionysios S. Kalogerias, Warren B. Powell, 2019

We present Free-MESSAGEp, the first zeroth-order algorithm for convex mean-semideviation-based risk-aware learning, which is also the first three-level zeroth-order compositional stochastic optimization algorithm, whatsoever. Using a non-trivial extension of Nesterov's classical results on Gaussian smoothing, we develop the Free-MESSAGEp algorithm from first principles, and show that it essentially solves a smoothed surrogate to the original problem, the former being a uniform approximation of the latter, in a useful, convenient sense. We then present a complete analysis of the Free-MESSAGEp algorithm, which establishes convergence in a user-tunable neighborhood of the optimal solutions of the original problem, as well as explicit convergence rates for both convex and strongly convex costs. Orderwise, and for fixed problem parameters, our results demonstrate no sacrifice in convergence speed compared to existing first-order methods, while striking a certain balance among the condition of the problem, its dimensionality, as well as the accuracy of the obtained results, naturally extending previous results in zeroth-order risk-neutral learning.

Optimization and Control, Machine Learning, Systems and Control

Linearly Convergent First-Order Algorithms for Semi-definite Programming

Cong D. Dang, Guanghui Lan, 2013

In this paper, we consider two formulations for Linear Matrix Inequalities (LMIs) under Slater type constraint qualification assumption, namely, SDP smooth and non-smooth formulations. We also propose two first-order linearly convergent algorithms for solving these formulations. Moreover, we introduce a bundle-level method which converges linearly uniformly for both smooth and non-smooth problems and does not require any smoothness information. The convergence properties of these algorithms are also discussed. Finally, we consider a special case of LMIs, linear system of inequalities, and show that a linearly convergent algorithm can be obtained under a weaker assumption.

Optimization and Control