No Arabic abstract
In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting linear convergence rate when other stochastic methods offered a sub-linear rate. However, as datasets grow ever bigger and clusters become widespread, the need for fast distribution methods is pressing. We propose here a distribution scheme for SAGA which maintains a linear convergence rate, even when communication between nodes is limited.
Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD is able to achieve linear convergence for a class of $delta$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the same rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.
The Barzilai-Borwein (BB) method has demonstrated great empirical success in nonlinear optimization. However, the convergence speed of BB method is not well understood, as the known convergence rate of BB method for quadratic problems is much worse than the steepest descent (SD) method. Therefore, there is a large discrepancy between theory and practice. To shrink this gap, we prove that the BB method converges $R$-linearly at a rate of $1-1/kappa$, where $kappa$ is the condition number, for strongly convex quadratic problems. In addition, an example with the theoretical rate of convergence is constructed, indicating the tightness of our bound.
Communication efficiency is a major bottleneck in the applications of distributed networks. To address the problem, the problem of quantized distributed optimization has attracted a lot of attention. However, most of the existing quantized distributed optimization algorithms can only converge sublinearly. To achieve linear convergence, this paper proposes a novel quantized distributed gradient tracking algorithm (Q-DGT) to minimize a finite sum of local objective functions over directed networks. Moreover, we explicitly derive the update rule for the number of quantization levels, and prove that Q-DGT can converge linearly even when the exchanged variables are respectively one bit. Numerical results also confirm the efficiency of the proposed algorithm.
The present paper considers leveraging network topology information to improve the convergence rate of ADMM for decentralized optimization, where networked nodes work collaboratively to minimize the objective. Such problems can be solved efficiently using ADMM via decomposing the objective into easier subproblems. Properly exploiting network topology can significantly improve the algorithm performance. Hybrid ADMM explores the direction of exploiting node information by taking into account node centrality but fails to utilize edge information. This paper fills the gap by incorporating both node and edge information and provides a novel convergence rate bound for decentralized ADMM that explicitly depends on network topology. Such a novel bound is attainable for certain class of problems, thus tight. The explicit dependence further suggests possible directions to optimal design of edge weights to achieve the best performance. Numerical experiments show that simple heuristic methods could achieve better performance, and also exhibits robustness to topology changes.
Recently, distributed dual averaging has received increasing attention due to its superiority in handling constraints and dynamic networks in multiagent optimization. However, all distributed dual averaging methods reported so far considered nonsmooth problems and have a convergence rate of $O(frac{1}{sqrt{t}})$. To achieve an improved convergence guarantee for smooth problems, this work proposes a second-order consensus scheme that assists each agent to locally track the global dual variable more accurately. This new scheme in conjunction with smoothness of the objective ensures that the accumulation of consensus error over time caused by incomplete global information is bounded from above. Then, a rigorous investigation of dual averaging with inexact gradient oracles is carried out to compensate the consensus error and achieve an $O(frac{1}{t})$ convergence rate. The proposed method is examined in a large-scale LASSO problem.