Convergence Analysis and System Design for Federated Learning over Wireless Networks


Published by Shuo Wan

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Shuo Wan - Jiaxun Lu - Pingyi Fan

Machine Learning, Distributed, Parallel, and Cluster Computing, Multiagent Systems


Federated learning (FL) has recently emerged as an important and promising learning scheme in IoT, enabling devices to jointly learn a model without sharing their raw data sets. However, because the training data in FL is not collected and stored centrally, FL training requires frequent model exchange, which is strongly affected by the wireless communication network. In such networks, limited bandwidth and random packet loss restrict interactions during training. Meanwhile, insufficient message synchronization among distributed clients can also affect FL convergence. In this paper, we analyze the convergence rate of FL training, considering the joint impact of the communication network and the training settings. Further, by considering the training costs in terms of time and power, we formulate the optimal scheduling problems for communication networks. The developed theoretical results can be used to assist system parameter selection and to explain how the wireless communication system influences the distributed training process and network scheduling.
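As a rough illustration of how packet loss restricts the model exchange, the sketch below averages only the client updates that actually reach the server in a round. It is not the scheduling scheme analyzed in the paper; the function name `aggregate_received`, the equal-weight average, and the rule of keeping the previous global model when every packet is lost are all assumptions made for this example.

```python
from typing import List


def aggregate_received(
    global_model: List[float],
    updates: List[List[float]],
    received: List[bool],
) -> List[float]:
    """Average the client models whose packets arrived (hypothetical helper).

    received[i] is True when client i's update was delivered this round.
    If nothing was delivered, the previous global model is kept unchanged.
    """
    delivered = [u for u, ok in zip(updates, received) if ok]
    if not delivered:
        return list(global_model)
    n = len(delivered)
    # Equal-weight, coordinate-wise average over the delivered updates.
    return [sum(col) / n for col in zip(*delivered)]


if __name__ == "__main__":
    model = [0.0]
    client_updates = [[1.0], [3.0], [5.0]]
    # The second client's packet is lost in this round.
    print(aggregate_received(model, client_updates, [True, False, True]))
```

With fewer delivered updates, each round's average rests on less data, which is one intuitive way to see why the loss rate enters a convergence analysis.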


Canh T. Dinh, Nguyen H. Tran, Minh N. H. Nguyen 2019

There is increasing interest in a fast-growing machine learning technique called Federated Learning, in which the model training is distributed over mobile user equipments (UEs), exploiting the UEs' local computation and training data. Despite its advantages in preserving data privacy, Federated Learning (FL) still faces challenges from heterogeneity across UEs' data and physical resources. We first propose an FL algorithm that can handle the heterogeneous UE data challenge without further assumptions except strongly convex and smooth loss functions. We provide the convergence rate characterizing the trade-off between the local computation rounds each UE uses to update its local model and the global communication rounds used to update the FL global model. We then employ the proposed FL algorithm in wireless networks as a resource allocation optimization problem that captures the trade-off between the FL convergence wall clock time and the energy consumption of UEs with heterogeneous computing and power resources. Even though the wireless resource allocation problem of FL is non-convex, we exploit this problem's structure to decompose it into three sub-problems and analyze their closed-form solutions as well as insights into problem design. Finally, we illustrate the theoretical analysis for the new algorithm with TensorFlow experiments and extensive numerical results for the wireless resource allocation sub-problems. The experimental results not only verify the theoretical convergence but also show that our proposed algorithm outperforms the vanilla FedAvg algorithm in terms of convergence rate and testing accuracy.

Machine Learning, Distributed, Parallel, and Cluster Computing, Networking and Internet Architecture

On the Convergence Time of Federated Learning Over Wireless Networks Under Imperfect CSI

Francesco Pase, Marco Giordani, Michele Zorzi 2021

Federated learning (FL) has recently emerged as an attractive decentralized solution for wireless networks to collaboratively train a shared model while keeping data localized. As a general approach, existing FL methods tend to assume perfect knowledge of the Channel State Information (CSI) during the training phase, which may not be easy to acquire in the case of fast fading channels. Moreover, analyses in the literature either consider a fixed number of clients participating in the training of the federated model, or simply assume that all clients operate at the maximum achievable rate to transmit model data. In this paper, we fill these gaps by proposing a training process that takes channel statistics as a bias to minimize the convergence time under imperfect CSI. Numerical experiments demonstrate that it is possible to reduce the training time by neglecting model updates from clients that cannot sustain a minimum predefined transmission rate. We also examine the trade-off between the number of clients involved in the training process and model accuracy as a function of different fading regimes.

Machine Learning
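The idea of neglecting updates from clients that cannot sustain a minimum transmission rate can be sketched as a simple filter. This is only an illustration under stated assumptions: each client's rate is approximated by the Shannon formula from an estimated linear SNR, and the names `achievable_rate`, `select_clients`, and `min_rate_bps` are hypothetical. The paper's actual procedure relies on channel statistics under imperfect CSI and is not reproduced here.

```python
import math
from typing import Dict, List


def achievable_rate(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon rate in bit/s for a given bandwidth and linear (not dB) SNR."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def select_clients(
    snr_by_client: Dict[str, float],
    bandwidth_hz: float,
    min_rate_bps: float,
) -> List[str]:
    """Keep only the clients whose estimated rate meets the minimum rate."""
    return [
        client
        for client, snr in snr_by_client.items()
        if achievable_rate(bandwidth_hz, snr) >= min_rate_bps
    ]


if __name__ == "__main__":
    # Hypothetical SNR estimates for three clients sharing 1 MHz each.
    estimates = {"a": 3.0, "b": 0.5, "c": 15.0}
    print(select_clients(estimates, bandwidth_hz=1e6, min_rate_bps=1e6))
```

Raising `min_rate_bps` shortens each round by excluding slow links, but leaves fewer clients contributing to the model, which is the trade-off the abstract refers to.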

Local Averaging Helps: Hierarchical Federated Learning and Convergence Analysis

Jiayi Wang, Shiqiang Wang, Rong-Rong Chen 2020

Federated learning is an effective approach to realize collaborative learning among edge devices without exchanging raw data. In practice, these devices may connect to local hubs instead of connecting to the global server (aggregator) directly. Due to the (possibly limited) computation capability of these local hubs, it is reasonable to assume that they can perform simple averaging operations. A natural question is whether such local averaging is beneficial under different system parameters and how much gain can be obtained compared to the case without such averaging. In this paper, we study hierarchical federated learning with stochastic gradient descent (HF-SGD) and conduct a thorough theoretical analysis of its convergence behavior. In particular, we first consider the two-level HF-SGD (one level of local averaging) and then extend this result to an arbitrary number of levels (multiple levels of local averaging). The analysis demonstrates the impact of local averaging precisely as a function of system parameters. Due to the higher communication cost of global averaging, a strategy of decreasing the global averaging frequency and increasing the local averaging frequency is proposed. Experiments validate the proposed theoretical analysis and the advantages of HF-SGD.

Machine Learning, Distributed, Parallel, and Cluster Computing, Information Theory
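The two-level averaging schedule described above can be sketched as follows. This is a minimal illustration, not the authors' HF-SGD implementation: the local SGD steps between averaging events are omitted, all devices are weighted equally, and the names `synchronize`, `local_period`, and `global_period` are assumptions made for this example.

```python
from typing import List

Model = List[float]


def average(models: List[Model]) -> Model:
    """Coordinate-wise, equal-weight average of a list of models."""
    n = len(models)
    return [sum(col) / n for col in zip(*models)]


def synchronize(
    groups: List[List[Model]],
    step: int,
    local_period: int,
    global_period: int,
) -> List[List[Model]]:
    """Apply the averaging due at this step (hypothetical two-level schedule).

    groups[h] holds the models of the devices attached to local hub h.
    Global averaging takes precedence when both periods divide the step.
    """
    if step % global_period == 0:
        # Global server: average over every device, then broadcast to all.
        avg = average([m for group in groups for m in group])
        return [[list(avg) for _ in group] for group in groups]
    if step % local_period == 0:
        # Local hubs: each hub averages only its own devices.
        hub_avgs = [average(group) for group in groups]
        return [[list(a) for _ in group] for a, group in zip(hub_avgs, groups)]
    return groups


if __name__ == "__main__":
    hubs = [[[0.0], [2.0]], [[4.0], [6.0]]]
    print(synchronize(hubs, step=2, local_period=2, global_period=4))
    print(synchronize(hubs, step=4, local_period=2, global_period=4))
```

Choosing a small `local_period` and a large `global_period` corresponds to the strategy of frequent local averaging and infrequent, more expensive global averaging.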

Communication Efficient Federated Learning with Energy Awareness over Wireless Networks

Richeng Jin, Xiaofan He, Huaiyu Dai 2020

In federated learning (FL), reducing the communication overhead is one of the most critical challenges since the parameter server and the mobile devices share the training parameters over wireless links. With such consideration, we adopt the idea of SignSGD in which only the signs of the gradients are exchanged. Moreover, most of the existing works assume Channel State Information (CSI) available at both the mobile devices and the parameter server, and thus the mobile devices can adopt fixed transmission rates dictated by the channel capacity. In this work, only the parameter server side CSI is assumed, and channel capacity with outage is considered. In this case, an essential problem for the mobile devices is to select appropriate local processing and communication parameters (including the transmission rates) to achieve a desired balance between the overall learning performance and their energy consumption. Two optimization problems are formulated and solved, which optimize the learning performance given the energy consumption requirement, and vice versa. Furthermore, considering that the data may be distributed across the mobile devices in a highly uneven fashion in FL, a stochastic sign-based algorithm is proposed. Extensive simulations are performed to demonstrate the effectiveness of the proposed methods.
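To make the sign-exchange idea concrete, the sketch below has each device send only the sign of every gradient coordinate, and the server combine them by majority vote. The majority-vote aggregation and the names `compress`, `majority_vote`, and `signsgd_step` are assumptions for this illustration; the stochastic sign-based algorithm and the energy-aware parameter selection proposed in the paper are not shown.

```python
from typing import List


def sign(x: float) -> int:
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def compress(gradient: List[float]) -> List[int]:
    """Device side: keep only the sign of each coordinate."""
    return [sign(g) for g in gradient]


def majority_vote(sign_vectors: List[List[int]]) -> List[int]:
    """Server side: per-coordinate majority of the received signs."""
    return [sign(sum(col)) for col in zip(*sign_vectors)]


def signsgd_step(
    weights: List[float],
    client_gradients: List[List[float]],
    lr: float,
) -> List[float]:
    """One update using the voted sign direction (assumed aggregation rule)."""
    votes = majority_vote([compress(g) for g in client_gradients])
    return [w - lr * v for w, v in zip(weights, votes)]


if __name__ == "__main__":
    grads = [[0.5, -2.0], [0.1, -0.3], [-4.0, 0.2]]
    print(signsgd_step([1.0, 1.0], grads, lr=0.1))
```

Each coordinate costs one sign per device instead of a full floating-point value, which is the source of the communication saving; the magnitude information discarded here is what makes uneven data distributions harder to handle.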

Machine Learning, Signal Processing

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis

Baihe Huang, Xiaoxiao Li, Zhao Song 2021

Federated Learning (FL) is an emerging learning scheme that allows different distributed clients to train deep neural networks together without data sharing. Neural networks have become popular due to their unprecedented success. To the best of our knowledge, the theoretical guarantees of FL concerning neural networks with explicit forms and multi-step updates are unexplored. Nevertheless, training analysis of neural networks in FL is non-trivial for two reasons: first, the objective loss function we are optimizing is non-smooth and non-convex, and second, we are not even updating in the gradient direction. Existing convergence results for gradient descent-based methods rely heavily on the fact that the gradient direction is used for updating. This paper presents a new class of convergence analysis for FL, Federated Learning Neural Tangent Kernel (FL-NTK), which corresponds to overparameterized ReLU neural networks trained by gradient descent in FL and is inspired by the analysis in Neural Tangent Kernel (NTK). Theoretically, FL-NTK converges to a global-optimal solution at a linear rate with properly tuned learning parameters. Furthermore, with proper distributional assumptions, FL-NTK can also achieve good generalization.

Machine Learning, Distributed, Parallel, and Cluster Computing