A geometrically converging dual method for distributed optimization over time-varying graphs



Added by Marie Maros

Publication date 2018


Research language: English

Authors Marie Maros - Joakim Jalden

Optimization and Control


In this paper we consider a distributed convex optimization problem over time-varying undirected networks. We propose a dual method, primarily averaged network dual ascent (PANDA), that is proven to converge R-linearly to the optimal point given that the agents' objective functions are strongly convex and have Lipschitz continuous gradients. Like dual decomposition, PANDA requires half as many variable exchanges per iterate as methods based on DIGing, and can provide improved practical performance, as demonstrated empirically.
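The abstract compares PANDA with dual decomposition. As a point of reference, here is a minimal sketch of plain dual decomposition for a consensus problem, not PANDA itself. It assumes a fixed (not time-varying) path graph, hypothetical scalar quadratic objectives f_i(x) = 0.5 * a_i * (x - b_i)^2, and a hand-picked step size. The function name `dual_decomposition` and all parameters are illustrative. Each iterate needs only one exchange of the primal variable x between neighbors.

```python
def dual_decomposition(a, b, edges, step, iters):
    """Dual ascent for: minimize sum_i 0.5*a[i]*(x_i - b[i])**2  s.t.  x_i = x_j on every edge.

    Assumption: the graph given by `edges` is fixed and connected.
    """
    n = len(a)
    y = [0.0] * n  # dual variable seen by each agent (entries always sum to zero)
    x = list(b)
    for _ in range(iters):
        # Local primal step: x_i = argmin_x f_i(x) + y_i * x, in closed form for quadratics.
        x = [b[i] - y[i] / a[i] for i in range(n)]
        # One exchange of x across each edge: disagreement = (graph Laplacian) * x.
        disagreement = [0.0] * n
        for i, j in edges:
            disagreement[i] += x[i] - x[j]
            disagreement[j] += x[j] - x[i]
        # Dual ascent step on the consensus constraint.
        y = [y[i] + step * disagreement[i] for i in range(n)]
    return x


a = [1.0, 2.0, 4.0]          # hypothetical curvatures
b = [1.0, 3.0, -2.0]         # hypothetical local minimizers
path_edges = [(0, 1), (1, 2)]
x = dual_decomposition(a, b, path_edges, step=0.3, iters=500)
optimum = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of the summed objective
print(x, optimum)
```

The step size 0.3 is chosen below the strong-convexity constant divided by the largest Laplacian eigenvalue (1/3 for this graph), which is a standard sufficient condition for this static setting. It says nothing about the step sizes PANDA admits on time-varying graphs.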


PANDA: A Dual Linearly Converging Method for Distributed Optimization over Time-Varying Undirected Graphs

Marie Maros, Joakim Jalden, 2018

In this paper we consider a distributed convex optimization problem over time-varying networks. We propose a dual method that converges R-linearly to the optimal point given that the agents' objective functions are strongly convex and have Lipschitz continuous gradients. The proposed method requires half as many variable exchanges per iterate as methods based on DIGing, and yields improved practical performance, as demonstrated empirically.
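To make the "variable exchanges per iterate" comparison concrete, the following is a minimal sketch of DIGing in its commonly stated gradient-tracking form: x_{k+1} = W_k x_k - alpha * y_k and y_{k+1} = W_k y_k + grad(x_{k+1}) - grad(x_k). The scalar quadratic objectives, the alternating single-edge graph schedule, the pairwise-averaging weights, and the deliberately conservative step size are all assumptions made for illustration. Note that both x and y are mixed with neighbors, so two variables cross the network per iterate.

```python
def diging(a, b, edge_schedule, step, iters):
    """Gradient tracking for f_i(x) = 0.5*a[i]*(x - b[i])**2 over a time-varying graph.

    Assumption: in round k only the edge edge_schedule[k % len(edge_schedule)] is active,
    and its two endpoints average their values (a doubly stochastic mixing matrix).
    """
    n = len(a)

    def grad(x):
        return [a[i] * (x[i] - b[i]) for i in range(n)]

    def mix(v, edge):
        i, j = edge
        out = list(v)
        out[i] = out[j] = 0.5 * (v[i] + v[j])
        return out

    x = [0.0] * n
    g = grad(x)
    y = list(g)      # tracker starts at the local gradients
    exchanges = 0
    for k in range(iters):
        edge = edge_schedule[k % len(edge_schedule)]
        mixed_x = mix(x, edge)
        x_new = [mixed_x[i] - step * y[i] for i in range(n)]
        g_new = grad(x_new)
        mixed_y = mix(y, edge)
        y = [mixed_y[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
        exchanges += 2   # one exchange for x and one for y
    return x, y, exchanges


a = [1.0, 2.0, 4.0]                 # hypothetical curvatures
b = [1.0, 3.0, -2.0]                # hypothetical local minimizers
schedule = [(0, 1), (1, 2)]         # neither graph is connected alone; their union is
x, y, exchanges = diging(a, b, schedule, step=0.004, iters=6000)
optimum = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
print(x, optimum, exchanges)
```

A method that communicates only one variable per iterate would log half as many exchanges for the same number of iterations, which is the saving the abstract refers to.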

Optimization and Control

Eco-PANDA: A Computationally Economic, Geometrically Converging, Dual Optimization Method on Time-Varying Undirected Graphs

Marie Maros, Joakim Jalden, 2018

In this paper we consider distributed convex optimization over time-varying undirected graphs. We propose a linearized version of primarily averaged network dual ascent (PANDA) that requires lower computational cost. The proposed method, economic primarily averaged network dual ascent (Eco-PANDA), provably converges at an R-linear rate to the optimal point given that the agents' objective functions are strongly convex and have Lipschitz continuous gradients. The method is therefore competitive, in terms of the type of rate, with both DIGing and PANDA. It halves the communication cost of methods like DIGing while still converging R-linearly and having the same per-iterate complexity.
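For readers unfamiliar with the term, the standard definition of R-linear (geometric) convergence can be written as follows. The symbols $x_k$, $x^\star$, $C$ and $\rho$ are generic notation, not taken from the paper.

$$
\exists\, C > 0,\ \rho \in (0,1) \quad \text{such that} \quad \|x_k - x^\star\| \le C\,\rho^{k} \quad \text{for all } k \ge 0 .
$$

The error itself need not shrink at every step; it only has to stay below a geometrically decaying envelope.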

Optimization and Control

Distributed Regularized Dual Gradient Algorithm for Constrained Convex Optimization over Time-Varying Directed Graphs

Chuanye Gu, Zhiyou Wu, Jueyou Li, 2018

We investigate a distributed optimization problem over a cooperative multi-agent time-varying network, where each agent has its own decision variables that should be set so as to minimize its individual objective subject to local constraints and global coupling constraints. Based on the push-sum protocol and dual decomposition, we design a distributed regularized dual gradient algorithm to solve this problem; the algorithm runs over time-varying directed graphs and requires only column stochasticity of the communication matrices. By augmenting the corresponding Lagrangian function with a quadratic regularization term, we first obtain a bound on the Lagrange multipliers that, unlike most primal-dual methods, does not require constructing a compact set containing the dual optimal set. Then, we show that the proposed method achieves a convergence rate of order $\mathcal{O}(\ln T/T)$ for strongly convex objective functions, where $T$ is the number of iterations. Moreover, an explicit bound on the constraint violations is given. Finally, numerical results on the network utility maximization problem demonstrate the efficiency of the proposed algorithm.

Optimization and Control

Distributed Convex Optimization With Coupling Constraints Over Time-Varying Directed Graphs

Chuanye Gu, Zhiyou Wu, Jueyou Li, 2018

This paper considers a distributed convex optimization problem over a time-varying multi-agent network, where each agent has its own decision variables that should be set so as to minimize its individual objective subject to local constraints and global coupling equality constraints. Over directed graphs, a distributed algorithm is proposed that incorporates the push-sum protocol into dual subgradient methods. Under the convexity assumption, the optimality of primal and dual variables, and constraint violations is first established. Then the explicit convergence rates of the proposed algorithm are obtained. Finally, some numerical experiments on the economic dispatch problem are provided to demonstrate the efficacy of the proposed algorithm.
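The push-sum protocol mentioned above lets agents average values over directed graphs using only column-stochastic mixing. Below is a minimal sketch of push-sum averaging alone, not the dual subgradient method of the paper. The function name `push_sum`, the three-node schedule of directed edges, and the initial values are assumptions for illustration. Each node splits its running sum and weight equally among itself and its current out-neighbors, and the ratio of the two approaches the network average.

```python
def push_sum(values, out_neighbor_schedule, rounds):
    """Push-sum averaging over a time-varying directed graph.

    Assumption: out_neighbor_schedule[k % len(...)] maps a node to its list of
    out-neighbors in round k; every node also keeps a share for itself.
    """
    n = len(values)
    s = list(values)   # running sums
    w = [1.0] * n      # push-sum weights
    for k in range(rounds):
        out_neighbors = out_neighbor_schedule[k % len(out_neighbor_schedule)]
        s_next = [0.0] * n
        w_next = [0.0] * n
        for j in range(n):
            targets = [j] + out_neighbors.get(j, [])
            share = 1.0 / len(targets)   # column j of the mixing matrix sums to 1
            for i in targets:
                s_next[i] += share * s[j]
                w_next[i] += share * w[j]
        s, w = s_next, w_next
    # The ratio corrects for the imbalance that directed links introduce.
    return [s[i] / w[i] for i in range(n)]


# Hypothetical schedule: one directed edge per round; the union forms a directed cycle.
schedule = [{0: [1]}, {1: [2]}, {2: [0]}]
estimates = push_sum([3.0, -1.0, 7.0], schedule, rounds=150)
print(estimates)  # each entry approaches the average of the initial values
```

Only column stochasticity is used here: a sender needs to know its own out-degree, but no node needs to know its in-degree, which is why the protocol suits directed graphs.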

Optimization and Control

Accelerated Gradient Tracking over Time-varying Graphs for Decentralized Optimization

Huan Li, Zhouchen Lin, 2021

Decentralized optimization over time-varying graphs has become increasingly common in modern machine learning with massive data stored on millions of mobile devices, such as in federated learning. This paper revisits the widely used accelerated gradient tracking and extends it to time-varying graphs. We prove the $O((\frac{\gamma}{1-\sigma_{\gamma}})^2\sqrt{\frac{L}{\epsilon}})$ and $O((\frac{\gamma}{1-\sigma_{\gamma}})^{1.5}\sqrt{\frac{L}{\mu}}\log\frac{1}{\epsilon})$ complexities for the practical single-loop accelerated gradient tracking over time-varying graphs when the problems are nonstrongly convex and strongly convex, respectively, where $\gamma$ and $\sigma_{\gamma}$ are two common constants characterizing the network connectivity, $\epsilon$ is the desired precision, and $L$ and $\mu$ are the smoothness and strong convexity constants, respectively. Our complexities improve significantly over the ones of $O(\frac{1}{\epsilon^{5/7}})$ and $O((\frac{L}{\mu})^{5/7}\frac{1}{(1-\sigma)^{1.5}}\log\frac{1}{\epsilon})$, respectively, which were proved in the original literature only for static graphs, where $\frac{1}{1-\sigma}$ equals $\frac{\gamma}{1-\sigma_{\gamma}}$ when the network is time-invariant. When combined with a multiple consensus subroutine, the dependence on the network connectivity constants can be further improved to $O(1)$ and $O(\frac{\gamma}{1-\sigma_{\gamma}})$ for the computation and communication complexities, respectively. When the network is static, by employing Chebyshev acceleration, our complexities exactly match the lower bounds without hiding any poly-logarithmic factor for both nonstrongly convex and strongly convex problems.

Optimization and Control, Machine Learning