Analysis and Optimal Edge Assignment For Hierarchical Federated Learning on Non-IID Data

443 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Naram Mhaisen

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Naram Mhaisen - Alaa Awad - Amr Mohamed

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Distributed learning algorithms aim to leverage distributed and diverse data stored at users devices to learn a global phenomena by performing training amongst participating devices and periodically aggregating their local models parameters into a global model. Federated learning is a promising paradigm that allows for extending local training among the participant devices before aggregating the parameters, offering better communication efficiency. However, in the cases where the participants data are strongly skewed (i.e., non-IID), the local models can overfit local data, leading to low performing global model. In this paper, we first show that a major cause of the performance drop is the weighted distance between the distribution over classes on users devices and the global distribution. Then, to face this challenge, we leverage the edge computing paradigm to design a hierarchical learning system that performs Federated Gradient Descent on the user-edge layer and Federated Averaging on the edge-cloud layer. In this hierarchical architecture, we formalize and optimize this user-edge assignment problem such that edge-level data distributions turn to be similar (i.e., close to IID), which enhances the Federated Averaging performance. Our experiments on multiple real-world datasets show that the proposed optimized assignment is tractable and leads to faster convergence of models towards a better accuracy value.

قيم البحث

104 - Hangyu Zhu , Jinjin Xu , Shiqing Liu 2021

Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained in federated learning usually have worse performance than those trained in the standard centralized learning mode, especially w hen the training data are not independent and identically distributed (Non-IID) on the local devices. In this survey, we pro-vide a detailed analysis of the influence of Non-IID data on both parametric and non-parametric machine learning models in both horizontal and vertical federated learning. In addition, cur-rent research work on handling challenges of Non-IID data in federated learning are reviewed, and both advantages and disadvantages of these approaches are discussed. Finally, we suggest several future research directions before concluding the paper.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

RingFed: Reducing Communication Costs in Federated Learning on Non-IID Data

121 - Guang Yang , Ke Mu , Chunhe Song 2021

Federated learning is a widely used distributed deep learning framework that protects the privacy of each client by exchanging model parameters rather than raw data. However, federated learning suffers from high communication costs, as a considerable number of model parameters need to be transmitted many times during the training process, making the approach inefficient, especially when the communication network bandwidth is limited. This article proposes RingFed, a novel framework to reduce communication overhead during the training process of federated learning. Rather than transmitting parameters between the center server and each client, as in original federated learning, in the proposed RingFed, the updated parameters are transmitted between each client in turn, and only the final result is transmitted to the central server, thereby reducing the communication overhead substantially. After several local updates, clients first send their parameters to another proximal client, not to the center server directly, to preaggregate. Experiments on two different public datasets show that RingFed has fast convergence, high model accuracy, and low communication cost.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

Byzantine-Robust Federated Learning via Credibility Assessment on Non-IID Data

130 - Kun Zhai , Qiang Ren , Junli Wang 2021

Federated learning is a novel framework that enables resource-constrained edge devices to jointly learn a model, which solves the problem of data protection and data islands. However, standard federated learning is vulnerable to Byzantine attacks, wh ich will cause the global model to be manipulated by the attacker or fail to converge. On non-iid data, the current methods are not effective in defensing against Byzantine attacks. In this paper, we propose a Byzantine-robust framework for federated learning via credibility assessment on non-iid data (BRCA). Credibility assessment is designed to detect Byzantine attacks by combing adaptive anomaly detection model and data verification. Specially, an adaptive mechanism is incorporated into the anomaly detection model for the training and prediction of the model. Simultaneously, a unified update algorithm is given to guarantee that the global model has a consistent direction. On non-iid data, our experiments demonstrate that the BRCA is more robust to Byzantine attacks compared with conventional methods

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

Semi-Decentralized Federated Edge Learning for Fast Convergence on Non-IID Data

412 - Yuchang Sun , Jiawei Shao , Yuyi Mao 2021

Federated edge learning (FEEL) has emerged as an effective alternative to reduce the large communication latency in Cloud-based machine learning solutions, while preserving data privacy. Unfortunately, the learning performance of FEEL may be compromi sed due to limited training data in a single edge cluster. In this paper, we investigate a novel framework of FEEL, namely semi-decentralized federated edge learning (SD-FEEL). By allowing model aggregation between different edge clusters, SD-FEEL enjoys the benefit of FEEL in reducing training latency and improves the learning performance by accessing richer training data from multiple edge clusters. A training algorithm for SD-FEEL with three main procedures in each round is presented, including local model updates, intra-cluster and inter-cluster model aggregations, and it is proved to converge on non-independent and identically distributed (non-IID) data. We also characterize the interplay between the network topology of the edge servers and the communication overhead of inter-cluster model aggregation on training performance. Experiment results corroborate our analysis and demonstrate the effectiveness of SD-FFEL in achieving fast convergence. Besides, guidelines on choosing critical hyper-parameters of the training algorithm are also provided.

بنية الشبكات والإنترنت التعلم الآلي

CFedAvg: Achieving Efficient Communication and Fast Convergence in Non-IID Federated Learning

106 - Haibo Yang , Jia Liu , Elizabeth S. Bentley 2021

Federated learning (FL) is a prevailing distributed learning paradigm, where a large number of workers jointly learn a model without sharing their training data. However, high communication costs could arise in FL due to large-scale (deep) learning m odels and bandwidth-constrained connections. In this paper, we introduce a communication-efficient algorithmic framework called CFedAvg for FL with non-i.i.d. datasets, which works with general (biased or unbiased) SNR-constrained compressors. We analyze the convergence rate of CFedAvg for non-convex functions with constant and decaying learning rates. The CFedAvg algorithm can achieve an $mathcal{O}(1 / sqrt{mKT} + 1 / T)$ convergence rate with a constant learning rate, implying a linear speedup for convergence as the number of workers increases, where $K$ is the number of local steps, $T$ is the number of total communication rounds, and $m$ is the total worker number. This matches the convergence rate of distributed/federated learning without compression, thus achieving high communication efficiency while not sacrificing learning accuracy in FL. Furthermore, we extend CFedAvg to cases with heterogeneous local steps, which allows different workers to perform a different number of local steps to better adapt to their own circumstances. The interesting observation in general is that the noise/variance introduced by compressors does not affect the overall convergence rate order for non-i.i.d. FL. We verify the effectiveness of our CFedAvg algorithm on three datasets with two gradient compression schemes of different compression ratios.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية