Linear Convergent Decentralized Optimization with Compression

101 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Xiaorui Liu

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Xiaorui Liu - Yao Li - Rongrong Wang

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Communication compression has become a key strategy to speed up distributed optimization. However, existing decentralized algorithms with compression mainly focus on compressing DGD-type algorithms. They are unsatisfactory in terms of convergence rate, stability, and the capability to handle heterogeneous data. Motivated by primal-dual algorithms, this paper proposes the first underline{L}inunderline{EA}r convergent underline{D}ecentralized algorithm with compression, LEAD. Our theory describes the coupled dynamics of the inexact primal and dual update as well as compression error, and we provide the first consensus error bound in such settings without assuming bounded gradients. Experiments on convex problems validate our theoretical analysis, and empirical study on deep neural nets shows that LEAD is applicable to non-convex problems.

قيم البحث

97 - Yao Li , Xiaorui Liu , Jiliang Tang 2021

Decentralized optimization and communication compression have exhibited their great potential in accelerating distributed machine learning by mitigating the communication bottleneck in practice. While existing decentralized algorithms with communicat ion compression mostly focus on the problems with only smooth components, we study the decentralized stochastic composite optimization problem with a potentially non-smooth component. A underline{Prox}imal gradient underline{L}inunderline{EA}r convergent underline{D}ecentralized algorithm with compression, Prox-LEAD, is proposed with rigorous theoretical analyses in the general stochastic setting and the finite-sum setting. Our theorems indicate that Prox-LEAD works with arbitrary compression precision, and it tremendously reduces the communication cost almost for free. The superiorities of the proposed algorithms are demonstrated through the comparison with state-of-the-art algorithms in terms of convergence complexities and numerical experiments. Our algorithmic framework also generally enlightens the compressed communication on other primal-dual algorithms by reducing the impact of inexact iterations, which might be of independent interest.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التحسين والتحكم

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

84 - Sulaiman A. Alghunaim , Kun Yuan , Ali H. Sayed 2019

Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based proximal decentralized methods are known to converge to the optimal solution with sublinear rates, and it remains unclear whether this family of methods can achieve global linear convergence. To tackle this problem, this work assumes the non-smooth regularization term is common across all networked agents, which is the case for many machine learning problems. Under this condition, we design a proximal gradient decentralized algorithm whose fixed point coincides with the desired minimizer. We then provide a concise proof that establishes its linear convergence. In the absence of the non-smooth term, our analysis technique covers the well known EXTRA algorithm and provides useful bounds on the convergence rate and step-size.

التحسين والتحكم النظم الموزعة والتوازية والحوسبة العنقودية

Compressed Gradient Tracking Methods for Decentralized Optimization with Linear Convergence

184 - Yiwei Liao , Zhuorui Li , Kun Huang 2021

Communication compression techniques are of growing interests for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multi-agent network usi ng only local computation and peer-to-peer communication. In this paper, we first propose a novel compressed gradient tracking algorithm (C-GT) that combines gradient tracking technique with communication compression. In particular, C-GT is compatible with a general class of compression operators that unifies both unbiased and biased compressors. We show that C-GT inherits the advantages of gradient tracking-based algorithms and achieves linear convergence rate for strongly convex and smooth objective functions. In the second part of this paper, we propose an error feedback based compressed gradient tracking algorithm (EF-C-GT) to further improve the algorithm efficiency for biased compression operators. Numerical examples complement the theoretical findings and demonstrate the efficiency and flexibility of the proposed algorithms.

التحسين والتحكم النظم الموزعة والتوازية والحوسبة العنقودية أنظمة متعددة العملاء

Innovation Compression for Communication-efficient Distributed Optimization with Linear Convergence

192 - Jiaqi Zhang , Keyou You , Lihua Xie 2021

Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimiza tion problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD is able to achieve linear convergence for a class of $delta$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the same rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.

التحسين والتحكم النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

Tight Linear Convergence Rate of ADMM for Decentralized Optimization

137 - Meng Ma , Bingcong Li , Georgios B. Giannakis 2019

The present paper considers leveraging network topology information to improve the convergence rate of ADMM for decentralized optimization, where networked nodes work collaboratively to minimize the objective. Such problems can be solved efficiently using ADMM via decomposing the objective into easier subproblems. Properly exploiting network topology can significantly improve the algorithm performance. Hybrid ADMM explores the direction of exploiting node information by taking into account node centrality but fails to utilize edge information. This paper fills the gap by incorporating both node and edge information and provides a novel convergence rate bound for decentralized ADMM that explicitly depends on network topology. Such a novel bound is attainable for certain class of problems, thus tight. The explicit dependence further suggests possible directions to optimal design of edge weights to achieve the best performance. Numerical experiments show that simple heuristic methods could achieve better performance, and also exhibits robustness to topology changes.

التحسين والتحكم النظم الموزعة والتوازية والحوسبة العنقودية معالجة الإشارات