A Communication-Efficient and Privacy-Aware Distributed Algorithm for Sparse PCA

Added by Lei Wang

Publication date 2021

The research language is English

Authors Lei Wang - Xin Liu - Yin Zhang

Optimization and Control

As a prominent variant of principal component analysis (PCA), sparse PCA attempts to find sparse loading vectors when conducting dimension reduction. This paper aims to compute sparse PCA by solving an optimization problem that pursues orthogonality and sparsity simultaneously. We propose a splitting and alternating approach, leading to an efficient distributed algorithm, called DAL1, for solving this nonconvex and nonsmooth optimization problem. Convergence of DAL1 to stationary points has been rigorously established. Computational experiments demonstrate that, due to its fast convergence in terms of iteration count, DAL1 requires far fewer rounds of communication to reach the prescribed accuracy than existing peer methods. Moreover, in contrast to existing algorithms, DAL1 carries a relatively small risk of data leakage.
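
The abstract does not spell out the optimization problem. One standard way to pursue orthogonality and sparsity simultaneously, given here only as an assumed illustration, is an $\ell_1$-regularized model over orthonormal matrices. The data matrix $A$, the number of components $p$, and the sparsity weight $\mu > 0$ are notation introduced for this sketch, and $\|X\|_1$ denotes the entrywise sum of absolute values:

$$
\min_{X \in \mathbb{R}^{n \times p}} \; -\tfrac{1}{2}\operatorname{tr}\!\left(X^\top A A^\top X\right) + \mu \|X\|_1
\quad \text{s.t.} \quad X^\top X = I_p
$$

The trace term rewards explained variance, the orthogonality constraint makes the problem nonconvex, and the $\ell_1$ term makes it nonsmooth, which is consistent with how the abstract characterizes the problem.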

A Randomized Rounding Algorithm for Sparse PCA

83 - Kimon Fountoulakis , Abhisek Kundu , Eugenia-Maria Kontopoulou and Petros Drineas 2015

We present and analyze a simple, two-step algorithm to approximate the optimal solution of the sparse PCA problem. Our approach first solves an L1-penalized version of the NP-hard sparse PCA optimization problem and then uses a randomized rounding strategy to sparsify the resulting dense solution. Our main theoretical result guarantees an additive error approximation and provides a tradeoff between sparsity and accuracy. Our experimental evaluation indicates that our approach is competitive in practice, even compared to state-of-the-art toolboxes such as Spasm.

Data Structures and Algorithms Machine Learning
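
The second step described in the abstract above, randomized rounding of a dense vector, can be sketched as follows. This is a minimal illustration under an assumed rounding rule (keep entry i with probability proportional to its magnitude and rescale it so the result is unbiased); the abstract does not give the exact rule, and the function name `randomized_round` and the parameter `s` (target expected sparsity) are hypothetical.

```python
import random


def randomized_round(x, s, rng=None):
    """Sparsify a dense vector by unbiased randomized rounding.

    Assumed rule (not taken from the abstract): entry i is kept with
    probability p_i = min(1, s * |x_i| / ||x||_1) and rescaled to
    x_i / p_i, so the expected output equals x and the expected number
    of nonzeros is at most s.
    """
    rng = rng or random.Random()
    l1 = sum(abs(v) for v in x)
    if l1 == 0:
        return [0.0] * len(x)
    out = []
    for v in x:
        p = min(1.0, s * abs(v) / l1)
        # Entries with p == 0 are already zero and stay zero.
        if p > 0 and rng.random() < p:
            out.append(v / p)
        else:
            out.append(0.0)
    return out
```

Larger values of `s` keep more entries and stay closer to the dense input, which mirrors the sparsity versus accuracy tradeoff mentioned in the abstract.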

Walkman: A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization

138 - Xianghui Mao , Kun Yuan , Yubin Hu 2018

This paper addresses consensus optimization problems in a multi-agent network, where all agents collaboratively find a minimizer for the sum of their private functions. We develop a new decentralized algorithm in which each agent communicates only with its neighbors. State-of-the-art decentralized algorithms use communications between either all pairs of adjacent agents or a random subset of them at each iteration. Another class of algorithms uses a random walk incremental strategy, which sequentially activates a succession of nodes; these incremental algorithms require diminishing step sizes to converge to the solution, so their convergence is relatively slow. In this work, we propose a random walk algorithm that uses a fixed step size and converges faster than the existing random walk incremental algorithms. Our algorithm is also communication efficient. Each iteration uses only one link to communicate the latest information for an agent to another. Since this communication rule mimics a man walking around the network, we call our new algorithm Walkman. We establish convergence for convex and nonconvex objectives. For decentralized least squares, we derive a linear rate of convergence and obtain a better communication complexity than those of other decentralized algorithms. Numerical experiments verify our analysis results.

Optimization and Control Distributed Parallel and Cluster Computing Multiagent Systems

Communication-Efficient Distributed Optimization with Quantized Preconditioners

299 - Foivos Alimisis , Peter Davies , Dan Alistarh 2021

We investigate fast and communication-efficient algorithms for the classic problem of minimizing a sum of strongly convex and smooth functions that are distributed among $n$ different nodes, which can communicate using a limited number of bits. Most previous communication-efficient approaches for this problem are limited to first-order optimization, and therefore have linear dependence on the condition number in their communication complexity. We show that this dependence is not inherent: communication-efficient methods can in fact have sublinear dependence on the condition number. For this, we design and analyze the first communication-efficient distributed variants of preconditioned gradient descent for Generalized Linear Models, and for Newton's method. Our results rely on a new technique for quantizing both the preconditioner and the descent direction at each step of the algorithms, while controlling their convergence rate. We also validate our findings experimentally, showing fast convergence and reduced communication.

Optimization and Control Distributed Parallel and Cluster Computing

Distributed Picard Iteration: Application to Distributed EM and Distributed PCA

57 - Francisco L. Andrade , Mario A. T. Figueiredo , João Xavier 2021

In recent work, we proposed a distributed Picard iteration (DPI) that allows a set of agents, linked by a communication network, to find a fixed point of a locally contractive (LC) map that is the average of individual maps held by said agents. In this work, we build upon the DPI and its local linear convergence (LLC) guarantees to make several contributions. We show that Sanger's algorithm for principal component analysis (PCA) corresponds to the iteration of an LC map that can be written as the average of local maps, each map known to each agent holding a subset of the data. Similarly, we show that a variant of the expectation-maximization (EM) algorithm for parameter estimation from noisy and faulty measurements in a sensor network can be written as the iteration of an LC map that is the average of local maps, each available at just one node. Consequently, via the DPI, we derive two distributed algorithms - distributed EM and distributed PCA - whose LLC guarantees follow from those that we proved for the DPI. The verification of the LC condition for EM is challenging, as the underlying operator depends on random samples; thus the LC condition is of a probabilistic nature.

Optimization and Control Distributed Parallel and Cluster Computing

LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient Distributed Learning

101 - Tianyi Chen , Yuejiao Sun , Wotao Yin 2020

This paper targets solving distributed machine learning problems such as federated learning in a communication-efficient fashion. A class of new stochastic gradient descent (SGD) approaches has been developed, which can be viewed as the stochastic generalization of the recently developed lazily aggregated gradient (LAG) method, justifying the name LASG. LAG adaptively predicts the contribution of each round of communication and chooses only the significant ones to perform. It saves communication while also maintaining the rate of convergence. However, LAG only works with deterministic gradients, and applying it to stochastic gradients yields poor performance. The key components of LASG are a set of new rules tailored for stochastic gradients that can be implemented to save download, upload, or both. The new algorithms adaptively choose between fresh and stale stochastic gradients and have convergence rates comparable to the original SGD. LASG achieves impressive empirical performance: it typically saves total communication by an order of magnitude.

Optimization and Control Machine Learning
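
The idea of choosing between fresh and stale gradients, described in the abstract above, can be illustrated with a deliberately simplified sketch. It assumes a fixed threshold on the squared change of each worker's gradient, which is a stand-in for the adaptive rules of LAG and LASG that the abstract does not specify; the function name `lazy_aggregate` and the parameter `threshold` are hypothetical.

```python
def lazy_aggregate(fresh_grads, last_sent, threshold):
    """Aggregate worker gradients, skipping uploads that changed little.

    fresh_grads[m] is worker m's newly computed gradient and last_sent[m]
    is the gradient the server last received from it. A worker uploads
    only if the squared Euclidean change exceeds `threshold` (a fixed
    value assumed for this sketch); otherwise the server reuses the
    stale copy. Returns (aggregate, server_copies, number_of_uploads).
    """
    uploads = 0
    server_copies = []
    for g, s in zip(fresh_grads, last_sent):
        change = sum((a - b) ** 2 for a, b in zip(g, s))
        if change > threshold:
            server_copies.append(list(g))  # fresh gradient is uploaded
            uploads += 1
        else:
            server_copies.append(list(s))  # stale gradient is reused
    dim = len(server_copies[0])
    aggregate = [sum(c[j] for c in server_copies) for j in range(dim)]
    return aggregate, server_copies, uploads
```

Raising `threshold` skips more uploads and so saves more communication, at the cost of aggregating staler gradients.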