بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Convergence of block coordinate descent with diminishing radius for nonconvex optimization

150 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hanbaek Lyu

تاريخ النشر 2020

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Hanbaek Lyu

التحسين والتحكم التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Block coordinate descent (BCD), also known as nonlinear Gauss-Seidel, is a simple iterative algorithm for nonconvex optimization that sequentially minimizes the objective function in each block coordinate while the other coordinates are held fixed. We propose a version of BCD that is guaranteed to converge to the stationary points of block-wise convex and differentiable objective functions under constraints. Furthermore, we obtain a best-case rate of convergence of order $log n/sqrt{n}$, where $n$ denotes the number of iterations. A key idea is to restrict the parameter search within a diminishing radius to promote stability of iterates, and then to show that such auxiliary constraints vanish in the limit. As an application, we provide a modified alternating least squares algorithm for nonnegative CP tensor factorization that converges to the stationary points of the reconstruction error with the same bound on the best-case rate of convergence. We also experimentally validate our results with both synthetic and real-world data.

قيم البحث

اقرأ أيضاً

Markov Chain Block Coordinate Descent

477 - Tao Sun , Yuejiao Sun , Yangyang Xu 2018

The method of block coordinate gradient descent (BCD) has been a powerful method for large-scale optimization. This paper considers the BCD method that successively updates a series of blocks selected according to a Markov chain. This kind of block s election is neither i.i.d. random nor cyclic. On the other hand, it is a natural choice for some applications in distributed optimization and Markov decision process, where i.i.d. random and cyclic selections are either infeasible or very expensive. By applying mixing-time properties of a Markov chain, we prove convergence of Markov chain BCD for minimizing Lipschitz differentiable functions, which can be nonconvex. When the functions are convex and strongly convex, we establish both sublinear and linear convergence rates, respectively. We also present a method of Markov chain inertial BCD. Finally, we discuss potential applications.

التحسين والتحكم التعلم الآلي التعلم الالي

On the Convergence Rate of Stochastic Mirror Descent for Nonsmooth Nonconvex Optimization

159 - Siqi Zhang , Niao He 2018

In this paper, we investigate the non-asymptotic stationary convergence behavior of Stochastic Mirror Descent (SMD) for nonconvex optimization. We focus on a general class of nonconvex nonsmooth stochastic optimization problems, in which the objectiv e can be decomposed into a relatively weakly convex function (possibly non-Lipschitz) and a simple non-smooth convex regularizer. We prove that SMD, without the use of mini-batch, is guaranteed to converge to a stationary point in a convergence rate of $ mathcal{O}(1/sqrt{t}) $. The efficiency estimate matches with existing results for stochastic subgradient method, but is evaluated under a stronger stationarity measure. Our convergence analysis applies to both the original SMD and its proximal version, as well as the deterministic variants, for solving relatively weakly convex problems.

التحسين والتحكم

A Zeroth-Order Block Coordinate Descent Algorithm for Huge-Scale Black-Box Optimization

86 - HanQin Cai , Yuchen Lou , Daniel McKenzie 2021

We consider the zeroth-order optimization problem in the huge-scale setting, where the dimension of the problem is so large that performing even basic vector operations on the decision variables is infeasible. In this paper, we propose a novel algori thm, coined ZO-BCD, that exhibits favorable overall query complexity and has a much smaller per-iteration computational complexity. In addition, we discuss how the memory footprint of ZO-BCD can be reduced even further by the clever use of circulant measurement matrices. As an application of our new method, we propose the idea of crafting adversarial attacks on neural network based classifiers in a wavelet domain, which can result in problem dimensions of over 1.7 million. In particular, we show that crafting adversarial examples to audio classifiers in a wavelet domain can achieve the state-of-the-art attack success rate of 97.9%.

التحسين والتحكم الذكاء الاصطناعي التعلم الآلي

On the global convergence of randomized coordinate gradient descent for non-convex optimization

269 - Ziang Chen , Yingzhou Li , Jianfeng Lu 2021

In this work, we analyze the global convergence property of coordinate gradient descent with random choice of coordinates and stepsizes for non-convex optimization problems. Under generic assumptions, we prove that the algorithm iterate will almost s urely escape strict saddle points of the objective function. As a result, the algorithm is guaranteed to converge to local minima if all saddle points are strict. Our proof is based on viewing coordinate descent algorithm as a nonlinear random dynamical system and a quantitative finite block analysis of its linearization around saddle points.

التحسين والتحكم التحليل العددي النظم الديناميكية

Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order Coordinate Method

118 - Shengjun Zhang , Yunlong Dong , Dong Xie 2021

This paper investigates the stochastic distributed nonconvex optimization problem of minimizing a global cost function formed by the summation of $n$ local cost functions. We solve such a problem by involving zeroth-order (ZO) information exchange. I n this paper, we propose a ZO distributed primal-dual coordinate method (ZODIAC) to solve the stochastic optimization problem. Agents approximate their own local stochastic ZO oracle along with coordinates with an adaptive smoothing parameter. We show that the proposed algorithm achieves the convergence rate of $mathcal{O}(sqrt{p}/sqrt{T})$ for general nonconvex cost functions. We demonstrate the efficiency of proposed algorithms through a numerical example in comparison with the existing state-of-the-art centralized and distributed ZO algorithms.

التحسين والتحكم التعلم الآلي أنظمة وتحكم

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشام الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Convergence of block coordinate descent with diminishing radius for nonconvex optimization

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً