Accelerated Primal-Dual Algorithm for Distributed Non-convex Optimization

147 0 0.0 ( 0 )

Download Cite

Added by Shengjun Zhang

Publication date 2021

fields

and research's language is English

Authors Shengjun Zhang - Colleen P. Bailey

Optimization and Control

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper investigates accelerating the convergence of distributed optimization algorithms on non-convex problems. We propose a distributed primal-dual stochastic gradient descent~(SGD) equipped with powerball method to accelerate. We show that the proposed algorithm achieves the linear speedup convergence rate $mathcal{O}(1/sqrt{nT})$ for general smooth (possibly non-convex) cost functions. We demonstrate the efficiency of the algorithm through numerical experiments by training two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset to compare with state-of-the-art distributed SGD algorithms and centralized SGD algorithms.

rate research

A Chebyshev-Accelerated Primal-Dual Method for Distributed Optimization

162 - Jacob H. Seidman , Mahyar Fazlyab , George J. Pappas 2018

We consider a distributed optimization problem over a network of agents aiming to minimize a global objective function that is the sum of local convex and composite cost functions. To this end, we propose a distributed Chebyshev-accelerated primal-dual algorithm to achieve faster ergodic convergence rates. In standard distributed primal-dual algorithms, the speed of convergence towards a global optimum (i.e., a saddle point in the corresponding Lagrangian function) is directly influenced by the eigenvalues of the Laplacian matrix representing the communication graph. In this paper, we use Chebyshev matrix polynomials to generate gossip matrices whose spectral properties result in faster convergence speeds, while allowing for a fully distributed implementation. As a result, the proposed algorithm requires fewer gradient updates at the cost of additional rounds of communications between agents. We illustrate the performance of the proposed algorithm in a distributed signal recovery problem. Our simulations show how the use of Chebyshev matrix polynomials can be used to improve the convergence speed of a primal-dual algorithm over communication networks, especially in networks with poor spectral properties, by trading local computation by communication rounds.

Optimization and Control

A Decentralized Primal-Dual Framework for Non-convex Smooth Consensus Optimization

140 - Gabriel Mancino-Ball , Yangyang Xu , Jie Chen 2021

In this work, we introduce ADAPD, $textbf{A}$ $textbf{D}$ecentr$textbf{A}$lized $textbf{P}$rimal-$textbf{D}$ual algorithmic framework for solving non-convex and smooth consensus optimization problems over a network of distributed agents. ADAPD makes use of an inexact ADMM-type update. During each iteration, each agent first inexactly solves a local strongly convex subproblem and then performs a neighbor communication while updating a set of dual variables. Two variations to ADAPD are presented. The variants allow agents to balance the communication and computation workload while they collaboratively solve the consensus optimization problem. The optimal convergence rate for non-convex and smooth consensus optimization problems is established; namely, ADAPD achieves $varepsilon$-stationarity in $mathcal{O}(varepsilon^{-1})$ iterations. Numerical experiments demonstrate the superiority of ADAPD over several existing decentralized methods.

Optimization and Control

Accelerated meta-algorithm for convex optimization

176 - Alexander Gasnikov , Darina Dvinskikh , Pavel Dvurechensky 2020

We propose an accelerated meta-algorithm, which allows to obtain accelerated methods for convex unconstrained minimization in different settings. As an application of the general scheme we propose nearly optimal methods for minimizing smooth functions with Lipschitz derivatives of an arbitrary order, as well as for smooth minimax optimization problems. The proposed meta-algorithm is more general than the ones in the literature and allows to obtain better convergence rates and practical performance in several settings.

Optimization and Control

A New Randomized Block-Coordinate Primal-Dual Proximal Algorithm for Distributed Optimization

117 - Puya Latafat , Nikolaos M. Freris , Panagiotis Patrinos 2017

This paper proposes TriPD, a new primal-dual algorithm for minimizing the sum of a Lipschitz-differentiable convex function and two possibly nonsmooth convex functions, one of which is composed with a linear mapping. We devise a randomized block-coordinate version of the algorithm which converges under the same stepsize conditions as the full algorithm. It is shown that both the original as well as the block-coordinate scheme feature linear convergence rate when the functions involved are either piecewise linear-quadratic, or when they satisfy a certain quadratic growth condition (which is weaker than strong convexity). Moreover, we apply the developed algorithms to the problem of multi-agent optimization on a graph, thus obtaining novel synchronous and asynchronous distributed methods. The proposed algorithms are fully distributed in the sense that the updates and the stepsizes of each agent only depend on local information. In fact, no prior global coordination is required. Finally, we showcase an application of our algorithm in distributed formation control.

Optimization and Control

Primal-dual $varepsilon$-Subgradient Method for Distributed Optimization

253 - Kui Zhu , Yutao Tang 2021

This paper studies the distributed optimization problem where the objective functions might be nondifferentiable and subject to heterogeneous set constraints. Unlike existing subgradient methods, we focus on the case when the exact subgradients of the local objective functions can not be accessed by the agents. To solve this problem, we propose a projected primal-dual dynamics using only the objective functions approximate subgradients. We first prove that the formulated optimization problem can only be solved with an approximate error depending upon the accuracy of the available subgradients. Then, we show the exact solvability of this optimization problem if the accumulated approximation error is not too large. After that, we also give a novel componentwise normalized variant to improve the transient behavior of the convergent sequence. The effectiveness of our algorithms is verified by a numerical example.

Optimization and Control Systems and Control Systems and Control