Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Communication-Efficient and Personalized Federated Lottery Ticket Learning

81 0 0.0 ( 0 )

Download Cite

Added by Sejin Seo

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Sejin Seo - Seung-Woo Ko - Jihong Park

Machine Learning Information Theory Networking and Internet Architecture

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The lottery ticket hypothesis (LTH) claims that a deep neural network (i.e., ground network) contains a number of subnetworks (i.e., winning tickets), each of which exhibiting identically accurate inference capability as that of the ground network. Federated learning (FL) has recently been applied in LotteryFL to discover such winning tickets in a distributed way, showing higher accuracy multi-task learning than Vanilla FL. Nonetheless, LotteryFL relies on unicast transmission on the downlink, and ignores mitigating stragglers, questioning scalability. Motivated by this, in this article we propose a personalized and communication-efficient federated lottery ticket learning algorithm, coined CELL, which exploits downlink broadcast for communication efficiency. Furthermore, it utilizes a novel user grouping method, thereby alternating between FL and lottery learning to mitigate stragglers. Numerical simulations validate that CELL achieves up to 3.6% higher personalized task classification accuracy with 4.3x smaller total communication cost until convergence under the CIFAR-10 dataset.

rate research

Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM

139 - Chaouki Ben Issaid , Anis Elgabli , Jihong Park 2020

In this paper, we propose a communication-efficiently decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored and Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies, while incorporating link censoring for negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves the linear convergence rate when the local objective functions are strongly convex under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency in terms of the number of communication rounds and transmit energy consumption without compromising the accuracy and convergence speed, compared to the censored decentralized ADMM, and the worker grouping method of GADMM.

Machine Learning Information Theory Networking and Internet Architecture

Provably Efficient Lottery Ticket Discovery

83 - Cameron R. Wolfe , Qihan Wang , Junhyung Lyle Kim 2021

The lottery ticket hypothesis (LTH) claims that randomly-initialized, dense neural networks contain (sparse) subnetworks that, when trained an equal amount in isolation, can match the dense networks performance. Although LTH is useful for discovering efficient network architectures, its three-step process -- pre-training, pruning, and re-training -- is computationally expensive, as the dense model must be fully pre-trained. Luckily, early-bird tickets can be discovered within neural networks that are minimally pre-trained, allowing for the creation of efficient, LTH-inspired training procedures. Yet, no theoretical foundation of this phenomenon exists. We derive an analytical bound for the number of pre-training iterations that must be performed for a winning ticket to be discovered, thus providing a theoretical understanding of when and why such early-bird tickets exist. By adopting a greedy forward selection pruning strategy, we directly connect the pruned networks performance to the loss of the dense network from which it was derived, revealing a threshold in the number of pre-training iterations beyond which high-performing subnetworks are guaranteed to exist. We demonstrate the validity of our theoretical results across a variety of architectures and datasets, including multi-layer perceptrons (MLPs) trained on MNIST and several deep convolutional neural network (CNN) architectures trained on CIFAR10 and ImageNet.

Machine Learning Artificial Intelligence Machine Learning

A Generalized Lottery Ticket Hypothesis

299 - Ibrahim Alabdulmohsin , Larisa Markeeva , Daniel Keysers 2021

We introduce a generalization to the lottery ticket hypothesis in which the notion of sparsity is relaxed by choosing an arbitrary basis in the space of parameters. We present evidence that the original results reported for the canonical basis continue to hold in this broader setting. We describe how structured pruning methods, including pruning units or factorizing fully-connected layers into products of low-rank matrices, can be cast as particular instances of this generalized lottery ticket hypothesis. The investigations reported here are preliminary and are provided to encourage further research along this direction.

Machine Learning Computer Vision and Pattern Recognition

Dynamic Attention-based Communication-Efficient Federated Learning

199 - Zihan Chen , Kai Fong Ernest Chong , Tony Q. S. Quek 2021

Federated learning (FL) offers a solution to train a global machine learning model while still maintaining data privacy, without needing access to data stored locally at the clients. However, FL suffers performance degradation when client data distribution is non-IID, and a longer training duration to combat this degradation may not necessarily be feasible due to communication limitations. To address this challenge, we propose a new adaptive training algorithm $texttt{AdaFL}$, which comprises two components: (i) an attention-based client selection mechanism for a fairer training scheme among the clients; and (ii) a dynamic fraction method to balance the trade-off between performance stability and communication efficiency. Experimental results show that our $texttt{AdaFL}$ algorithm outperforms the usual $texttt{FedAvg}$ algorithm, and can be incorporated to further improve various state-of-the-art FL algorithms, with respect to three aspects: model accuracy, performance stability, and communication efficiency.

Machine Learning Distributed Parallel and Cluster Computing

FetchSGD: Communication-Efficient Federated Learning with Sketching

112 - Daniel Rothchild , Ashwinee Panda , Enayat Ullah 2020

Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FetchSGD, to overcome these challenges. FetchSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates and good convergence. We prove that FetchSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.

Machine Learning Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Communication-Efficient and Personalized Federated Lottery Ticket Learning

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions