
On a Faster $R$-Linear Convergence Rate of the Barzilai-Borwein Method

Published by: Dawei Li
Publication date: 2021
Research field: Informatics Engineering
Paper language: English

The Barzilai-Borwein (BB) method has demonstrated great empirical success in nonlinear optimization. However, the convergence speed of the BB method is not well understood: the known convergence rate of the BB method for quadratic problems is much worse than that of the steepest descent (SD) method, leaving a large discrepancy between theory and practice. To shrink this gap, we prove that the BB method converges $R$-linearly at a rate of $1-1/\kappa$, where $\kappa$ is the condition number, for strongly convex quadratic problems. In addition, an example attaining the theoretical rate of convergence is constructed, indicating the tightness of our bound.
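For concreteness, below is a minimal sketch of the classical BB (BB1) step-size rule on a strongly convex quadratic with condition number $\kappa = 100$. It is an illustration, not the paper's code: the function name `bb_method`, the diagonal test matrix, and the first-step heuristic are all choices made here.

```python
import numpy as np

def bb_method(A, b, x0, max_iter=1000, tol=1e-10):
    """Barzilai-Borwein (BB1) gradient method for the strongly convex
    quadratic f(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b."""
    x = x0.copy()
    g = A @ x - b
    alpha = 1.0 / np.linalg.norm(g)          # heuristic first step
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        # BB1 step: alpha = (s^T s)/(s^T y); on a quadratic s^T y = s^T A s > 0.
        alpha = (s @ s) / (s @ y)
        x, g = x_new, g_new
    return x, k

rng = np.random.default_rng(0)
n, kappa = 50, 100.0
A = np.diag(np.linspace(1.0, kappa, n))      # condition number kappa = 100
b = rng.standard_normal(n)
x_star, iters = bb_method(A, b, np.zeros(n))
print(iters, np.linalg.norm(A @ x_star - b))
```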


Read also

In this paper, we propose a new non-monotone conjugate gradient method for solving unconstrained nonlinear optimization problems. We first modify the non-monotone line search method by introducing a new trigonometric function to calculate the non-monotone parameter, which plays an essential role in the algorithm's efficiency. Then, we apply a convex combination of the Barzilai-Borwein step sizes to calculate the step size in each iteration. Under some suitable assumptions, we prove that the new algorithm has the global convergence property. The efficiency and effectiveness of the proposed method are demonstrated in practice by applying the algorithm to some standard test problems and non-negative matrix factorization problems.
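As a rough illustration of the step-size rule this abstract describes, the snippet below forms a convex combination of the two classical BB step sizes. The fixed weight `theta` and the safeguard are placeholders made here; the paper's trigonometric non-monotone parameter is not reproduced.

```python
import numpy as np

def bb_convex_step(s, y, theta=0.5):
    """Convex combination of the two classical BB step sizes,
    alpha = theta * alpha_BB1 + (1 - theta) * alpha_BB2, where
    s = x_k - x_{k-1} and y = g_k - g_{k-1}. theta is a fixed
    illustrative weight; the paper derives its own combination rule."""
    sty = s @ y
    if sty <= 0:                      # safeguard: fall back to a small step
        return 1e-4
    alpha_bb1 = (s @ s) / sty         # long BB step
    alpha_bb2 = sty / (y @ y)         # short BB step
    return theta * alpha_bb1 + (1.0 - theta) * alpha_bb2
```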
Xiuxian Li, Kuo-Yi Lin, Li Li (2021)
Communication has been seen as a significant bottleneck in industrial applications over large-scale networks. To alleviate the communication burden, sign-based optimization algorithms have recently gained popularity in both industrial and academic communities, and they are closely related to adaptive gradient methods such as Adam. Along this line, this paper investigates faster convergence for a variant of sign-based gradient descent, called scaled signGD, in three cases: 1) the objective function is strongly convex; 2) the objective function is nonconvex but satisfies the Polyak-Lojasiewicz (PL) inequality; 3) the gradient is stochastic. For the first two cases, the scaled signGD converges at a linear rate. For case 3), the algorithm converges linearly to a neighborhood of the optimal value when a constant learning rate is employed, and converges at a rate of $O(1/k)$ when using a diminishing learning rate, where $k$ is the iteration number. The results are also extended to the distributed setting by majority vote in a parameter-server framework. Finally, numerical experiments on logistic regression are performed to corroborate the theoretical findings.
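The following is a minimal sketch of plain sign-based gradient descent with an optional diminishing step `lr0/k`, the setting in which the abstract reports an $O(1/k)$ rate. The per-coordinate scaling that defines scaled signGD, as well as the stochastic and distributed variants, are omitted here.

```python
import numpy as np

def sign_gd(grad, x0, lr0=0.5, iters=2000, diminishing=True):
    """Sign-based gradient descent: step along -sign(grad(x)).
    With a diminishing step lr0/k the stochastic variant is reported
    to converge at rate O(1/k); the scaling of 'scaled signGD' is a
    detail of the paper and is not reproduced in this sketch."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, iters + 1):
        lr = lr0 / k if diminishing else lr0
        x = x - lr * np.sign(grad(x))
    return x

# Toy strongly convex objective f(x) = 0.5 ||x||^2 with gradient x.
print(sign_gd(lambda x: x, [3.0, -2.0]))
```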
In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting a linear convergence rate where other stochastic methods offered a sub-linear rate. However, as datasets grow ever bigger and clusters become widespread, the need for fast distributed methods is pressing. We propose here a distribution scheme for SAGA which maintains a linear convergence rate, even when communication between nodes is limited.
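The single-node SAGA update being distributed looks roughly as follows. This is a sketch under assumptions made here: the callback `comp_grad` and the least-squares example are hypothetical, and the paper's communication layer between nodes is not shown.

```python
import numpy as np

def saga(comp_grad, x0, n, step, iters=20000, rng=None):
    """Single-node SAGA sketch: keep a table with the most recent gradient
    of each of the n components and use it as a variance-reduced
    correction to the freshly sampled gradient."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float).copy()
    table = np.array([comp_grad(x, j) for j in range(n)])  # gradient memory
    avg = table.mean(axis=0)
    for _ in range(iters):
        j = rng.integers(n)
        g_new = comp_grad(x, j)
        x = x - step * (g_new - table[j] + avg)   # unbiased VR direction
        avg = avg + (g_new - table[j]) / n
        table[j] = g_new
    return x

# Hypothetical least-squares components f_j(x) = 0.5 (a_j^T x - y_j)^2.
rng = np.random.default_rng(1)
A_data = rng.standard_normal((100, 5))
y_data = A_data @ np.ones(5)
grad_j = lambda x, j: (A_data[j] @ x - y_data[j]) * A_data[j]
x_hat = saga(grad_j, np.zeros(5), n=100, step=0.02)
print(np.round(x_hat, 3))   # should approach the all-ones solution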
We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
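The gradient memory described here is the stochastic average gradient (SAG) idea; a minimal sketch follows. Unlike the SAGA snippet above, which uses the memory as an unbiased correction, SAG steps along the stored average directly; `comp_grad` is again a hypothetical component-gradient callback.

```python
import numpy as np

def sag(comp_grad, x0, n, step, iters=20000, rng=None):
    """Stochastic average gradient sketch: refresh one stored component
    gradient per iteration and step along the average of the whole
    memory, which yields a linear rate for strongly convex sums."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float).copy()
    table = np.zeros((n, x.size))        # stored gradients, start at zero
    total = np.zeros(x.size)
    for _ in range(iters):
        j = rng.integers(n)
        g_new = comp_grad(x, j)
        total = total + g_new - table[j]
        table[j] = g_new
        x = x - step * total / n         # step along the averaged memory
    return x
```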
In this paper, we aim to provide a comprehensive analysis of the linear-rate convergence of the alternating direction method of multipliers (ADMM) for solving linearly constrained convex composite optimization problems. Under a certain error bound condition, we establish the global linear rate of convergence for a more general semi-proximal ADMM with the dual steplength restricted to the open interval $(0, (1+\sqrt{5})/2)$. In our analysis, we assume neither strong convexity nor strict complementarity, only an error bound condition, which holds automatically for convex composite quadratic programming. This semi-proximal ADMM, which includes the classic ADMM, not only has the advantage of resolving the potential non-solvability of the subproblems in the classic ADMM but also can handle multi-block convex optimization problems efficiently. We use convex composite quadratic programming and quadratic semi-definite programming as important applications to demonstrate the significance of the obtained results. As a contribution of its own novelty to second-order variational analysis, a complete characterization of the isolated calmness of the nonlinear convex semi-definite optimization problem is provided in terms of its second-order sufficient optimality condition and the strict Robinson constraint qualification, for the purpose of proving the linear-rate convergence of the semi-proximal ADMM when applied to two- and multi-block convex quadratic semi-definite programming.
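A bare-bones scaled-form ADMM for the two-block consensus problem min $f(x) + g(z)$ s.t. $x = z$ is sketched below, with the dual steplength kept inside the open interval $(0, (1+\sqrt{5})/2)$ analyzed in the paper. The semi-proximal terms are set to zero, which recovers the classic ADMM as a special case, and the lasso-style example data are hypothetical.

```python
import numpy as np

def admm_consensus(prox_f, prox_g, dim, rho=1.0, tau=1.6, iters=300):
    """Scaled-form ADMM for min f(x) + g(z) s.t. x = z.
    tau is the dual steplength, kept inside the open interval
    (0, (1 + sqrt(5))/2) ~ (0, 1.618) from the paper's analysis.
    prox_f / prox_g are user-supplied proximal maps with penalty rho."""
    x = np.zeros(dim)
    z = np.zeros(dim)
    u = np.zeros(dim)                # scaled dual variable
    for _ in range(iters):
        x = prox_f(z - u, rho)       # x-update: prox of f
        z = prox_g(x + u, rho)       # z-update: prox of g
        u = u + tau * (x - z)        # dual update with steplength tau
    return x, z

# Example: f = 0.5 ||x - a||^2, g = lam * ||z||_1 (soft-thresholding prox).
a = np.array([3.0, -0.5, 0.2])
lam = 1.0
prox_f = lambda v, rho: (rho * v + a) / (rho + 1.0)
prox_g = lambda v, rho: np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
x, z = admm_consensus(prox_f, prox_g, dim=3)
print(x, z)   # both should approach the soft-thresholded solution [2, 0, 0]
```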
