A frequency-domain analysis of inexact gradient methods


Published by Oran Gannot

Publication date: 2019

Research field: Informatics Engineering

Language: English

Author: Oran Gannot

Subjects: Optimization and Control, Machine Learning, Numerical Analysis


We study robustness properties of some iterative gradient-based methods for strongly convex functions, as well as for the larger class of functions with sector-bounded gradients, under a relative error model. Proofs of the corresponding convergence rates are based on frequency-domain criteria for the stability of nonlinear systems. Applications are given to inexa


Lam M. Nguyen, Quoc Tran-Dinh, Dzung T. Phan, 2020

In this paper, we provide a unified convergence analysis for a class of shuffling-type gradient methods for solving a well-known finite-sum minimization problem commonly used in machine learning. This algorithm covers various variants such as randomized reshuffling, single shuffling, and cyclic/incremental gradient schemes. We consider two different settings: strongly convex and non-convex problems. Our main contribution consists of new non-asymptotic and asymptotic convergence rates for a general class of shuffling-type gradient methods to solve both non-convex and strongly convex problems. While our rate in the non-convex problem is new (i.e., not yet known under standard assumptions), the rate in the strongly convex case matches (up to a constant) the best-known results. However, unlike existing works in this direction, we only use standard assumptions such as smoothness and strong convexity. Finally, we empirically illustrate the effect of learning rates via non-convex logistic regression and neural network examples.
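The three variants named in the abstract differ only in how the component order is chosen at the start of each epoch. The following is a minimal sketch of that idea, not the authors' algorithm or step-size schedule: the function name `shuffling_gradient`, the least-squares objective, and the constant step `lr` are all assumptions made for illustration.

```python
import numpy as np


def shuffling_gradient(A, b, epochs=50, lr=0.5, scheme="reshuffle", seed=0):
    """Sketch of a shuffling-type gradient method (illustrative assumptions).

    Minimizes (1/n) * sum_i 0.5 * (A[i] @ w - b[i])**2 by visiting every
    component exactly once per epoch. `scheme` selects the visiting order:
      "reshuffle"   - a fresh random permutation every epoch
      "single"      - one random permutation, reused in every epoch
      "incremental" - the fixed cyclic order 0, 1, ..., n-1
    The constant step `lr` is a hypothetical choice, not the paper's schedule.
    """
    if scheme not in ("reshuffle", "single", "incremental"):
        raise ValueError(f"unknown scheme: {scheme}")
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    # Order used by "single" and "incremental"; "reshuffle" overwrites it.
    perm = np.arange(n) if scheme == "incremental" else rng.permutation(n)
    for _ in range(epochs):
        if scheme == "reshuffle":
            perm = rng.permutation(n)
        for i in perm:
            # Gradient of the i-th component 0.5 * (A[i] @ w - b[i])**2.
            grad_i = (A[i] @ w - b[i]) * A[i]
            w = w - lr * grad_i
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(20, 3))
    A /= np.linalg.norm(A, axis=1, keepdims=True)  # unit rows keep lr=0.5 stable
    b = A @ np.array([1.0, -2.0, 0.5])             # consistent system
    for scheme in ("reshuffle", "single", "incremental"):
        print(scheme, shuffling_gradient(A, b, scheme=scheme))
```

The demo uses a consistent linear system so that a constant step suffices; on general finite-sum problems a diminishing or epoch-dependent step would typically be needed.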

Subjects: Optimization and Control, Machine Learning

Inexact Proximal-Point Penalty Methods for Constrained Non-Convex Optimization

Qihang Lin, Runchao Ma, Yangyang Xu, 2019

In this paper, an inexact proximal-point penalty method is studied for constrained optimization problems, where the objective function is non-convex, and the constraint functions can also be non-convex. The proposed method approximately solves a sequence of subproblems, each of which is formed by adding to the original objective function a proximal term and quadratic penalty terms associated with the constraint functions. Under a weak-convexity assumption, each subproblem is made strongly convex and can be solved effectively to a required accuracy by an optimal gradient-based method. The computational complexity of the proposed method is analyzed separately for the cases of convex constraints and non-convex constraints. For both cases, the complexity results are established in terms of the number of proximal gradient steps needed to find an $\varepsilon$-stationary point. When the constraint functions are convex, we show a complexity result of $\tilde O(\varepsilon^{-5/2})$ to produce an $\varepsilon$-stationary point under Slater's condition. When the constraint functions are non-convex, the complexity becomes $\tilde O(\varepsilon^{-3})$ if a non-singularity condition holds on the constraints, and otherwise $\tilde O(\varepsilon^{-4})$ if a feasible initial solution is available.
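As a rough illustration of the subproblem structure described above, one possible form is sketched here. The notation is assumed rather than taken from the paper: $f$ is the objective, $g_i(x) \le 0$ for $i = 1, \dots, m$ are inequality constraints, $x_k$ is the current proximal center, $\gamma > 0$ is a proximal parameter, and $\beta_k > 0$ is a penalty parameter.

$$
x_{k+1} \approx \operatorname*{arg\,min}_{x} \; f(x) + \frac{\gamma}{2}\,\|x - x_k\|^2 + \frac{\beta_k}{2} \sum_{i=1}^{m} \big(\max\{g_i(x), 0\}\big)^2
$$

Under these assumptions, if $f$ is $\rho$-weakly convex (that is, $f + \frac{\rho}{2}\|\cdot\|^2$ is convex) and each $g_i$ is convex, then any $\gamma > \rho$ makes this objective $(\gamma - \rho)$-strongly convex, which is what allows a gradient-based inner solver to reach a prescribed accuracy.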

Subjects: Optimization and Control, Computational Complexity, Numerical Analysis

Parallel and distributed asynchronous adaptive stochastic gradient methods

Yangyang Xu, Yibo Xu, Yonggui Yan, 2020

Stochastic gradient methods (SGMs) are the predominant approaches to train deep learning models. The adapti

Subjects: Optimization and Control; Distributed, Parallel, and Cluster Computing; Numerical Analysis

Inexact Non-Convex Newton-Type Methods

Zhewei Yao, Peng Xu, Farbod Roosta-Khorasani, 2018

For solving large-scale non-convex problems, we propose inexact variants of trust region and adaptive cubic regularization methods, which, to increase efficiency, incorporate various approximations. In particular, in addition to approximate sub-problem solves, both the Hessian and the gradient are suitably approximated. Using rather mild conditions on such approximations, we show that our proposed inexact methods achieve similar optimal worst-case iteration complexities as the exact counterparts. Our proposed algorithms, and their respective theoretical analysis, do not require knowledge of any unknowable problem-related quantities, and hence are easily implementable in practice. In the context of finite-sum problems, we then explore randomized sub-sampling methods as ways to construct the gradient and Hessian approximations and examine the empirical performance of our algorithms on some real datasets.

Subjects: Optimization and Control

Acceleration Methods

Alexandre d'Aspremont, Damien Scieur, Adrien Taylor, 2021

This monograph covers some recent advances on a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, momentum and nested optimization schemes, which coincide in the quadratic case to form the Chebyshev method whose complexity is analyzed using Chebyshev polynomials. We discuss momentum methods in detail, starting with the seminal work of Nesterov (1983) and structure convergence proofs using a few master templates, such as that of optimized gradient methods, which have the key benefit of showing how momentum methods maximize convergence rates. We further cover proximal acceleration techniques, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques directly rely on the knowledge of some regularity parameters of the problem at hand, and we conclude by discussing restart schemes, a set of simple techniques to reach nearly optimal convergence rates while adapting to unobserved regularity parameters.
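To make the effect of momentum on a quadratic concrete, the sketch below compares plain gradient descent with the widely used constant-momentum form of Nesterov's method for strongly convex problems. It is an illustration under stated assumptions, not one of the monograph's templates: the function names are hypothetical, the objective is $\frac{1}{2} x^\top H x - c^\top x$ with $H$ symmetric positive definite, and both methods are given the exact extreme eigenvalues of $H$ (the regularity parameters the abstract refers to).

```python
import numpy as np


def gradient_descent(H, c, x0, iters):
    """Plain gradient descent on 0.5 * x @ H @ x - c @ x with step 1/L."""
    L = np.linalg.eigvalsh(H).max()  # smoothness constant (largest eigenvalue)
    x = x0.copy()
    for _ in range(iters):
        x = x - (H @ x - c) / L
    return x


def nesterov_strongly_convex(H, c, x0, iters):
    """Constant-momentum Nesterov iteration (illustrative variant).

    Assumes mu and L, the smallest and largest eigenvalues of H, are known.
    """
    eig = np.linalg.eigvalsh(H)
    mu, L = eig.min(), eig.max()
    beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
    x = x0.copy()
    y = x0.copy()
    for _ in range(iters):
        x_next = y - (H @ y - c) / L      # gradient step at the extrapolated point
        y = x_next + beta * (x_next - x)  # momentum extrapolation
        x = x_next
    return x


if __name__ == "__main__":
    H = np.diag(np.linspace(1.0, 100.0, 20))  # condition number 100
    c = np.ones(20)
    x_star = np.linalg.solve(H, c)
    x0 = np.zeros(20)
    for method in (gradient_descent, nesterov_strongly_convex):
        err = np.linalg.norm(method(H, c, x0, 100) - x_star)
        print(method.__name__, err)
```

Because the momentum coefficient depends on both `mu` and `L`, a poor estimate of either can degrade the method, which is the motivation for the restart schemes mentioned at the end of the abstract.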

Subjects: Optimization and Control, Machine Learning, Numerical Analysis