Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space

93 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Akihito Taya

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Akihito Taya - Takayuki Nishio - Masahiro Morikura

بنية الشبكات والإنترنت النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper proposes a decentralized FL scheme for IoE devices connected via multi-hop networks. FL has gained attention as an enabler of privacy-preserving algorithms, but it is not guaranteed that FL algorithms converge to the optimal point because of non-convexity when using decentralized parameter averaging schemes. Therefore, a distributed algorithm that converges to the optimal solution should be developed. The key idea of the proposed algorithm is to aggregate the local prediction functions, not in a parameter space but in a function space. Since machine learning tasks can be regarded as convex functional optimization problems, a consensus-based optimization algorithm achieves the global optimum if it is tailored to work in a function space. This paper at first analyzes the convergence of the proposed algorithm in a function space, which is referred to as a meta-algorithm. It is shown that spectral graph theory can be applied to the function space in a similar manner as that of numerical vectors. Then, a CMFD is developed for NN as an implementation of the meta-algorithm. CMFD leverages knowledge distillation to realize function aggregation among adjacent devices without parameter averaging. One of the advantages of CMFD is that it works even when NN models are different among the distributed learners. This paper shows that CMFD achieves higher accuracy than parameter aggregation under weakly-connected networks. The stability of CMFD is also higher than that of parameter aggregation methods.

قيم البحث

121 - Zhuangdi Zhu , Junyuan Hong , Jiayu Zhou 2021

Federated Learning (FL) is a decentralized machine-learning paradigm, in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges to FL, which c an incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users, other than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation} approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcasted to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that, our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

Bandwidth Slicing to Boost Federated Learning in Edge Computing

127 - Jun Li , Xiaoman Shen , Lei Chen 2019

Bandwidth slicing is introduced to support federated learning in edge computing to assure low communication delay for training traffic. Results reveal that bandwidth slicing significantly improves training efficiency while achieving good learning accuracy.

بنية الشبكات والإنترنت النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

Device Sampling for Heterogeneous Federated Learning: Theory, Algorithms, and Implementation

97 - Su Wang , Mengyuan Lee , Seyyedali Hosseinalipour 2021

The conventional federated learning (FedL) architecture distributes machine learning (ML) across worker devices by having them train local models that are periodically aggregated by a server. FedL ignores two important characteristics of contemporary wireless networks, however: (i) the network may contain heterogeneous communication/computation resources, while (ii) there may be significant overlaps in devices local data distributions. In this work, we develop a novel optimization methodology that jointly accounts for these factors via intelligent device sampling complemented by device-to-device (D2D) offloading. Our optimization aims to select the best combination of sampled nodes and data offloading configuration to maximize FedL training accuracy subject to realistic constraints on the network topology and device capabilities. Theoretical analysis of the D2D offloading subproblem leads to new FedL convergence bounds and an efficient sequential convex optimizer. Using this result, we develop a sampling methodology based on graph convolutional networks (GCNs) which learns the relationship between network attributes, sampled nodes, and resulting offloading that maximizes FedL accuracy. Through evaluation on real-world datasets and network measurements from our IoT testbed, we find that our methodology while sampling less than 5% of all devices outperforms conventional FedL substantially both in terms of trained model accuracy and required resource utilization.

بنية الشبكات والإنترنت النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

Decentralized Deep Learning using Momentum-Accelerated Consensus

105 - Aditya Balu , Zhanhong Jiang , Sin Yong Tan 2020

We consider the problem of decentralized deep learning where multiple agents collaborate to learn from a distributed dataset. While there exist several decentralized deep learning approaches, the majority consider a central parameter-server topology for aggregating the model parameters from the agents. However, such a topology may be inapplicable in networked systems such as ad-hoc mobile networks, field robotics, and power network systems where direct communication with the central parameter server may be inefficient. In this context, we propose and analyze a novel decentralized deep learning algorithm where the agents interact over a fixed communication topology (without a central server). Our algorithm is based on the heavy-ball acceleration method used in gradient-based optimization. We propose a novel consensus protocol where each agent shares with its neighbors its model parameters as well as gradient-momentum values during the optimization process. We consider both strongly convex and non-convex objective functions and theoretically analyze our algorithms performance. We present several empirical comparisons with competing decentralized learning methods to demonstrate the efficacy of our approach under different communication topologies.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التعلم الالي

Decentralized Federated Learning: Balancing Communication and Computing Costs

149 - Wei Liu , Li Chen , 2021

Decentralized federated learning (DFL) is a powerful framework of distributed machine learning and decentralized stochastic gradient descent (SGD) is a driving engine for DFL. The performance of decentralized SGD is jointly influenced by communicatio n-efficiency and convergence rate. In this paper, we propose a general decentralized federated learning framework to strike a balance between communication-efficiency and convergence performance. The proposed framework performs both multiple local updates and multiple inter-node communications periodically, unifying traditional decentralized SGD methods. We establish strong convergence guarantees for the proposed DFL algorithm without the assumption of convex objective function. The balance of communication and computation rounds is essential to optimize decentralized federated learning under constrained communication and computation resources. For further improving communication-efficiency of DFL, compressed communication is applied to DFL, named DFL with compressed communication (C-DFL). The proposed C-DFL exhibits linear convergence for strongly convex objectives. Experiment results based on MNIST and CIFAR-10 datasets illustrate the superiority of DFL over traditional decentralized SGD methods and show that C-DFL further enhances communication-efficiency.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية