
Decentralized Proximal Gradient Algorithms with Linear Convergence Rates

Published by Sulaiman Alghunaim
Publication date: 2019
Research language: English





This work studies a class of non-smooth decentralized multi-agent optimization problems in which the agents aim to minimize a sum of local strongly convex smooth components plus a common non-smooth term. We propose a general primal-dual algorithmic framework that unifies many existing state-of-the-art algorithms. We establish linear convergence of the proposed method to the exact solution in the presence of the non-smooth term. Moreover, for the more general class of problems with agent-specific non-smooth terms, we show that linear convergence cannot be achieved (in the worst case) by the class of algorithms that use the gradients of the smooth parts and the proximal mappings of the non-smooth parts. We further provide a numerical counterexample showing that some state-of-the-art algorithms fail to converge linearly for strongly convex objectives with different local non-smooth terms.
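To make the problem class concrete, the following minimal sketch runs a plain, centralized proximal gradient iteration on a toy instance with quadratic local costs and a common l1 regularizer. It is not the decentralized primal-dual framework proposed in the paper. The cost functions, the data, the step size, and the names `soft_threshold` and `proximal_gradient` are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal mapping of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(b, lam, step=0.5, iters=200):
    """Minimize (1/K) * sum_k 0.5*||x - b[k]||^2 + lam*||x||_1.

    b has shape (K, d): row k is the (hypothetical) data held by agent k.
    """
    x = np.zeros(b.shape[1])
    for _ in range(iters):
        grad = x - b.mean(axis=0)      # gradient of the averaged smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Toy data for K = 3 agents and d = 3 variables (assumed for illustration).
b = np.array([[2.0, -0.1, 0.4],
              [1.0,  0.3, -1.6],
              [3.0, -0.2, -1.8]])
lam = 0.5
x_hat = proximal_gradient(b, lam)

# For this toy cost the minimizer has a closed form to compare against.
x_star = soft_threshold(b.mean(axis=0), lam)
```

On this toy instance the smooth part is strongly convex, so the distance to `x_star` shrinks by a constant factor per iteration, which is the kind of linear rate the abstract refers to.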


Read also

Communication compression techniques are of growing interest for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multi-agent network using only local computation and peer-to-peer communication. In this paper, we first propose a novel compressed gradient tracking algorithm (C-GT) that combines the gradient tracking technique with communication compression. In particular, C-GT is compatible with a general class of compression operators that unifies both unbiased and biased compressors. We show that C-GT inherits the advantages of gradient tracking-based algorithms and achieves a linear convergence rate for strongly convex and smooth objective functions. In the second part of this paper, we propose an error-feedback-based compressed gradient tracking algorithm (EF-C-GT) to further improve the algorithm's efficiency for biased compression operators. Numerical examples complement the theoretical findings and demonstrate the efficiency and flexibility of the proposed algorithms.
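The abstract above does not name specific compressors. As a hedged illustration of the biased/unbiased distinction, the sketch below implements two commonly used operators: top-k sparsification (biased) and random-k sparsification with rescaling (unbiased). The function names and the test vector are assumptions, and this is not the C-GT algorithm itself.

```python
import numpy as np

def top_k(v, k):
    """Biased compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def rand_k(v, k, rng):
    """Unbiased compressor: keep k random entries, rescaled by d/k."""
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

v = np.array([3.0, -5.0, 1.0, 0.5])
sparse = top_k(v, 2)                   # keeps only the entries 3.0 and -5.0

# Averaging many random-k outputs approximately recovers v (unbiasedness).
rng = np.random.default_rng(0)
avg = np.mean([rand_k(v, 2, rng) for _ in range(20000)], axis=0)
```

Top-k always discards the small entries, so its expected output differs from `v`; random-k is noisy on each call but correct on average.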
Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based proximal decentralized methods are known to converge to the optimal solution at sublinear rates, and it remains unclear whether this family of methods can achieve global linear convergence. To tackle this problem, this work assumes the non-smooth regularization term is common across all networked agents, which is the case for many machine learning problems. Under this condition, we design a proximal gradient decentralized algorithm whose fixed point coincides with the desired minimizer. We then provide a concise proof that establishes its linear convergence. In the absence of the non-smooth term, our analysis technique covers the well-known EXTRA algorithm and provides useful bounds on the convergence rate and step size.
We study convergence rates of the classic proximal bundle method for a variety of nonsmooth convex optimization problems. We show that, without any modification, this algorithm adapts to converge faster in the presence of smoothness or a Hölder growth condition. Our analysis reveals that with a constant stepsize, the bundle method is adaptive, yet it exhibits suboptimal convergence rates. We overcome this shortcoming by proposing nonconstant stepsize schemes with optimal rates. These schemes use function information such as growth constants, which might be prohibitive in practice. We complete the paper with a new parallelizable variant of the bundle method that attains near-optimal rates without prior knowledge of function parameters. These results improve on the limited existing convergence rates and provide a unified analysis approach across problem settings and algorithmic details. Numerical experiments support our findings and illustrate the effectiveness of the parallel bundle method.
We consider learning an undirected graphical model from sparse data. While several efficient algorithms have been proposed for the graphical lasso (GL), the alternating direction method of multipliers (ADMM) is the main approach taken for the joint graphical lasso (JGL). We propose proximal gradient procedures with and without a backtracking option for the JGL. These procedures are first-order and relatively simple, and the subproblems are solved efficiently in closed form. We further show the boundedness of the solution of the JGL problem and of the iterates of the algorithms. The numerical results indicate that the proposed algorithms can achieve high accuracy and precision, and their efficiency is competitive with state-of-the-art algorithms.
In this paper, we consider minimizing a sum of local convex objective functions in a distributed setting, where the cost of communication and/or computation can be expensive. We extend and generalize the analysis for a class of nested gradient-based distributed algorithms (NEAR-DGD; Berahas, Bollapragada, Keskar and Wei, 2018) to account for multiple gradient steps at every iteration. We show the effect of performing multiple gradient steps on the rate of convergence and on the size of the neighborhood of convergence, and prove R-linear convergence to the exact solution with a fixed number of gradient steps and an increasing number of consensus steps. We test the performance of the generalized method on quadratic functions and show the effect of multiple consensus and gradient steps in terms of iterations, number of gradient evaluations, number of communications and cost.
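The nested structure described in the last abstract (a fixed number of local gradient steps followed by an increasing number of consensus rounds) can be sketched on a toy problem as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scalar quadratic costs, the ring network with uniform 1/3 weights, the step size, the schedule of k consensus rounds at iteration k, and the names `ring_weights` and `nested_dgd` are all hypothetical choices.

```python
import numpy as np

def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring: 1/3 to self and each neighbor."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W

def nested_dgd(b, step=0.5, grad_steps=2, iters=60):
    """Agent i holds f_i(x) = 0.5 * (x - b[i])**2; the sum is minimized at b.mean()."""
    n = b.size
    W = ring_weights(n)
    x = np.zeros(n)                    # one scalar estimate per agent
    for k in range(1, iters + 1):
        for _ in range(grad_steps):    # fixed number of local gradient steps
            x = x - step * (x - b)
        for _ in range(k):             # k consensus rounds at iteration k
            x = W @ x
    return x

# Hypothetical local data for 5 agents; the global minimizer is their mean, 2.0.
b = np.array([1.0, 4.0, -2.0, 0.5, 6.5])
x_final = nested_dgd(b)
```

With a fixed number of consensus rounds per iteration the agents would generally settle in a neighborhood of the minimizer; growing the number of rounds is what drives every agent's estimate to the common value `b.mean()` in this toy run.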