ترغب بنشر مسار تعليمي؟ اضغط هنا

Perturbation techniques for convergence analysis of proximal gradient method and other first-order algorithms via variational analysis

329   0   0.0 ( 0 )
 نشر من قبل Jin Zhang
 تاريخ النشر 2018
  مجال البحث
والبحث باللغة English




اسأل ChatGPT حول البحث

We develop new perturbation techniques for conducting convergence analysis of various first-order algorithms for a class of nonsmooth optimization problems. We consider the iteration scheme of an algorithm to construct a perturbed stationary point set-valued map, and define the perturbing parameter by the difference of two consecutive iterates. Then, we show that the calmness condition of the induced set-valued map, together with a local version of the proper separation of stationary value condition, is a sufficient condition to ensure the linear convergence of the algorithm. The equivalence of the calmness condition to the one for the canonically perturbed stationary point set-valued map is proved, and this equivalence allows us to derive some sufficient conditions for calmness by using some recent developments in variational analysis. These sufficient conditions are different from existing results (especially, those error-bound-based ones) in that they can be easily verified for many concrete application models. Our analysis is focused on the fundamental proximal gradient (PG) method, and it enables us to show that any accumulation of the sequence generated by the PG method must be a stationary point in terms of the proximal subdifferential, instead of the limiting subdifferential. This result finds the surprising fact that the solution quality found by the PG method is in general superior. Our analysis also leads to some improvement for the linear convergence results of the PG method in the convex case. The new perturbation technique can be conveniently used to derive linear rate convergence of a number of other first-order methods including the well-known alternating direction method of multipliers and primal-dual hybrid gradient method, under mild assumptions.



قيم البحث

اقرأ أيضاً

Many problems in science and engineering involve, as part of their solution process, the consideration of a separable function which is the sum of two convex functions, one of them possibly non-smooth. Recently a few works have discussed inexa
This work studies a class of non-smooth decentralized multi-agent optimization problems where the agents aim at minimizing a sum of local strongly-convex smooth components plus a common non-smooth term. We propose a general primal-dual algorithmic fr amework that unifies many existing state-of-the-art algorithms. We establish linear convergence of the proposed method to the exact solution in the presence of the non-smooth term. Moreover, for the more general class of problems with agent specific non-smooth terms, we show that linear convergence cannot be achieved (in the worst case) for the class of algorithms that uses the gradients and the proximal mappings of the smooth and non-smooth parts, respectively. We further provide a numerical counterexample that shows how some state-of-the-art algorithms fail to converge linearly for strongly-convex objectives and different local non-smooth terms.
In this paper, we propose two new solution schemes to solve the stochastic strongly monotone variational inequality problems: the stochastic extra-point solution scheme and the stochastic extra-momentum solution scheme. The first one is a general sch eme based on updating the iterative sequence and an auxiliary extra-point sequence. In the case of deterministic VI model, this approach includes several state-of-the-art first-order methods as its special cases. The second scheme combines two momentum-based directions: the so-called heavy-ball direction and the optimism direction, where only one projection per iteration is required in its updating process. We show that, if the variance of the stochastic oracle is appropriately controlled, then both schemes can be made to achieve optimal iteration complexity of $mathcal{O}left(kappalnleft(frac{1}{epsilon}right)right)$ to reach an $epsilon$-solution for a strongly monotone VI problem with condition number $kappa$. We show that these methods can be readily incorporated in a zeroth-order approach to solve stochastic minimax saddle-point problems, where only noisy and biased samples of the objective can be obtained, with a total sample complexity of $mathcal{O}left(frac{kappa^2}{epsilon}lnleft(frac{1}{epsilon}right)right)$
313 - Long Chen , Hao Luo 2021
We present a unified convergence analysis for first order convex optimization methods using the concept of strong Lyapunov conditions. Combining this with suitable time scaling factors, we are able to handle both convex and strong convex cases, and e stablish contraction properties of Lyapunov functions for many existing ordinary differential equation models. Then we derive prevailing first order optimization algorithms, such as proximal gradient methods, heavy ball methods (also known as momentum methods), Nesterov accelerated gradient methods, and accelerated proximal gradient methods from numerical discretizations of corresponding dynamical systems. We also apply strong Lyapunov conditions to the discrete level and provide a more systematical analysis framework. Another contribution is a novel second order dynamical system called Hessian-driven Nesterov accelerated gradient flow which can be used to design and analyze accelerated first order methods for smooth and non-smooth convex optimizations.
This paper investigates the stochastic distributed nonconvex optimization problem of minimizing a global cost function formed by the summation of $n$ local cost functions. We solve such a problem by involving zeroth-order (ZO) information exchange. I n this paper, we propose a ZO distributed primal-dual coordinate method (ZODIAC) to solve the stochastic optimization problem. Agents approximate their own local stochastic ZO oracle along with coordinates with an adaptive smoothing parameter. We show that the proposed algorithm achieves the convergence rate of $mathcal{O}(sqrt{p}/sqrt{T})$ for general nonconvex cost functions. We demonstrate the efficiency of proposed algorithms through a numerical example in comparison with the existing state-of-the-art centralized and distributed ZO algorithms.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا