No Arabic abstract
This paper proposes TriPD, a new primal-dual algorithm for minimizing the sum of a Lipschitz-differentiable convex function and two possibly nonsmooth convex functions, one of which is composed with a linear mapping. We devise a randomized block-coordinate version of the algorithm which converges under the same stepsize conditions as the full algorithm. It is shown that both the original as well as the block-coordinate scheme feature linear convergence rate when the functions involved are either piecewise linear-quadratic, or when they satisfy a certain quadratic growth condition (which is weaker than strong convexity). Moreover, we apply the developed algorithms to the problem of multi-agent optimization on a graph, thus obtaining novel synchronous and asynchronous distributed methods. The proposed algorithms are fully distributed in the sense that the updates and the stepsizes of each agent only depend on local information. In fact, no prior global coordination is required. Finally, we showcase an application of our algorithm in distributed formation control.
This paper investigates accelerating the convergence of distributed optimization algorithms on non-convex problems. We propose a distributed primal-dual stochastic gradient descent~(SGD) equipped with powerball method to accelerate. We show that the proposed algorithm achieves the linear speedup convergence rate $mathcal{O}(1/sqrt{nT})$ for general smooth (possibly non-convex) cost functions. We demonstrate the efficiency of the algorithm through numerical experiments by training two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset to compare with state-of-the-art distributed SGD algorithms and centralized SGD algorithms.
We consider a distributed optimization problem over a network of agents aiming to minimize a global objective function that is the sum of local convex and composite cost functions. To this end, we propose a distributed Chebyshev-accelerated primal-dual algorithm to achieve faster ergodic convergence rates. In standard distributed primal-dual algorithms, the speed of convergence towards a global optimum (i.e., a saddle point in the corresponding Lagrangian function) is directly influenced by the eigenvalues of the Laplacian matrix representing the communication graph. In this paper, we use Chebyshev matrix polynomials to generate gossip matrices whose spectral properties result in faster convergence speeds, while allowing for a fully distributed implementation. As a result, the proposed algorithm requires fewer gradient updates at the cost of additional rounds of communications between agents. We illustrate the performance of the proposed algorithm in a distributed signal recovery problem. Our simulations show how the use of Chebyshev matrix polynomials can be used to improve the convergence speed of a primal-dual algorithm over communication networks, especially in networks with poor spectral properties, by trading local computation by communication rounds.
This paper studies the distributed optimization problem where the objective functions might be nondifferentiable and subject to heterogeneous set constraints. Unlike existing subgradient methods, we focus on the case when the exact subgradients of the local objective functions can not be accessed by the agents. To solve this problem, we propose a projected primal-dual dynamics using only the objective functions approximate subgradients. We first prove that the formulated optimization problem can only be solved with an approximate error depending upon the accuracy of the available subgradients. Then, we show the exact solvability of this optimization problem if the accumulated approximation error is not too large. After that, we also give a novel componentwise normalized variant to improve the transient behavior of the convergent sequence. The effectiveness of our algorithms is verified by a numerical example.
This work studies multi-agent sharing optimization problems with the objective function being the sum of smooth local functions plus a convex (possibly non-smooth) function coupling all agents. This scenario arises in many machine learning and engineering applications, such as regression over distributed features and resource allocation. We reformulate this problem into an equivalent saddle-point problem, which is amenable to decentralized solutions. We then propose a proximal primal-dual algorithm and establish its linear convergence to the optimal solution when the local functions are strongly-convex. To our knowledge, this is the first linearly convergent decentralized algorithm for multi-agent sharing problems with a general convex (possibly non-smooth) coupling function.
We consider the zeroth-order optimization problem in the huge-scale setting, where the dimension of the problem is so large that performing even basic vector operations on the decision variables is infeasible. In this paper, we propose a novel algorithm, coined ZO-BCD, that exhibits favorable overall query complexity and has a much smaller per-iteration computational complexity. In addition, we discuss how the memory footprint of ZO-BCD can be reduced even further by the clever use of circulant measurement matrices. As an application of our new method, we propose the idea of crafting adversarial attacks on neural network based classifiers in a wavelet domain, which can result in problem dimensions of over 1.7 million. In particular, we show that crafting adversarial examples to audio classifiers in a wavelet domain can achieve the state-of-the-art attack success rate of 97.9%.