FLRA: A Reference Architecture for Federated Learning Systems

Published by Sin Kit Lo

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Sin Kit Lo - Qinghua Lu - Hye-Young Paik

Machine Learning - Distributed, Parallel, and Cluster Computing - Software Engineering

Abstract

Federated learning is an emerging machine learning paradigm that enables multiple devices to train models locally and formulate a global model without sharing the clients' local data. A federated learning system can be viewed as a large-scale distributed system involving different components and stakeholders with diverse requirements and constraints. Hence, developing a federated learning system requires both software system design thinking and machine learning knowledge. Although much effort has been put into federated learning from the machine learning perspective, our previous systematic literature review of the area shows a distinct lack of consideration for software architecture design in federated learning. In this paper, we propose FLRA, a reference architecture for federated learning systems, which provides a template design for federated learning-based solutions. FLRA is based on an extensive review of existing patterns of federated learning systems found in the literature and in existing industrial implementations. It consists of a pool of architectural patterns that could address the frequently recurring design problems in federated learning architectures. FLRA can serve as a design guideline that assists architects and developers with practical solutions to their problems, and it can be further customised.
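To make the "train locally, formulate a global model" step concrete, the following is a minimal sketch of sample-weighted model averaging in the style of FedAvg. It is an illustration only and is not taken from the FLRA paper; the names `ClientUpdate` and `federated_average` are hypothetical, and model weights are assumed to be flat lists of floats.

```python
from typing import List, Sequence, Tuple

# Hypothetical type: (model weights, number of local training samples).
# Only these values are sent to the server; raw client data stays on the device.
ClientUpdate = Tuple[Sequence[float], int]


def federated_average(updates: List[ClientUpdate]) -> List[float]:
    """Return the sample-weighted average of the clients' model weights."""
    if not updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all clients must report weights of the same length")

    global_weights = [0.0] * dim
    for weights, n in updates:
        share = n / total_samples  # clients with more data count for more
        for i, value in enumerate(weights):
            global_weights[i] += value * share
    return global_weights


if __name__ == "__main__":
    # Two clients: the second holds three times as much data as the first.
    print(federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))
```

In a full system, each round would broadcast the averaged weights back to the clients for the next round of local training; that loop is omitted here.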

Sin Kit Lo, Qinghua Lu, Liming Zhu (2021)

Federated learning has received fast-growing interest from academia and industry as a way to tackle the challenges of data hungriness and privacy in machine learning. A federated learning system can be viewed as a large-scale distributed system with different components and stakeholders, as numerous client devices participate in federated learning. Designing a federated learning system requires software system design thinking in addition to machine learning knowledge. Although much effort has been put into federated learning from the machine learning technique perspective, the software architecture design concerns in building federated learning systems have been largely ignored. Therefore, in this paper, we present a collection of architectural patterns to deal with the design challenges of federated learning systems. Architectural patterns present reusable solutions to commonly occurring problems within a given context during software architecture design. The presented patterns are based on the results of a systematic literature review and include three client management patterns, four model management patterns, three model training patterns, and four model aggregation patterns. The patterns are associated with particular state transitions in a federated learning model lifecycle, serving as guidance for the effective use of the patterns in the design of federated learning systems.

Machine Learning - Distributed, Parallel, and Cluster Computing - Software Engineering

Federated Neural Architecture Search

Mengwei Xu, Yuxin Zhao, Kaigui Bian (2020)

To preserve user privacy while enabling mobile intelligence, techniques have been proposed to train deep neural networks on decentralized data. However, training over decentralized data makes the design of neural architectures, which is already difficult, even harder. This difficulty is further amplified when designing and deploying different neural architectures for heterogeneous mobile platforms. In this work, we propose to integrate automatic neural architecture search into decentralized training, as a new DNN training paradigm called Federated Neural Architecture Search, or federated NAS. To deal with the primary challenge of limited on-client computational and communication resources, we present FedNAS, a highly optimized framework for efficient federated NAS. FedNAS fully exploits the key opportunity of insufficient model candidate re-training during the architecture search process, and incorporates three key optimizations: parallel training of candidates on partial clients, early dropping of candidates with inferior performance, and dynamic round numbers. Tested on large-scale datasets and typical CNN architectures, FedNAS achieves model accuracy comparable to that of a state-of-the-art NAS algorithm that trains models with centralized data, and also reduces the client cost by up to two orders of magnitude compared to a straightforward design of federated NAS.

Machine Learning - Distributed, Parallel, and Cluster Computing

FedSAE: A Novel Self-Adaptive Federated Learning Framework in Heterogeneous Systems

Li Li, Moming Duan, Duo Liu (2021)

Federated Learning (FL) is a novel distributed machine learning paradigm that allows thousands of edge devices to train models locally without uploading their data to a central server. However, because real federated settings are resource-constrained, FL encounters systems heterogeneity, which directly causes many stragglers and indirectly leads to a significant reduction in accuracy. To solve the problems caused by systems heterogeneity, we introduce FedSAE, a novel self-adaptive federated framework that automatically adjusts the training task of each device and actively selects participants to alleviate the performance degradation. In this work, we 1) propose FedSAE, which leverages the complete information of devices' historical training tasks to predict the affordable training workload for each device. In this way, FedSAE can estimate the reliability of each device and self-adaptively adjust the amount of training load per client in each round; and 2) combine our framework with Active Learning to self-adaptively select participants, which accelerates the convergence of the global model. In our framework, the server evaluates each device's training value based on its training loss and then selects the clients with higher value for the global model to reduce communication overhead. The experimental results indicate that in a highly heterogeneous system, FedSAE converges faster than FedAvg, the vanilla FL framework. Furthermore, FedSAE outperforms FedAvg on several federated datasets: it improves test accuracy by 26.7% and reduces stragglers by 90.3% on average.
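The idea of selecting participants by training loss can be illustrated with a short sketch. This is a deliberate simplification and not FedSAE's actual procedure: the abstract does not specify how a client's value is computed from its loss, so the sketch assumes the value is the loss itself and picks the top `k` clients deterministically. The function name `select_clients_by_loss` and the client identifiers are hypothetical.

```python
from typing import Dict, List


def select_clients_by_loss(losses: Dict[str, float], k: int) -> List[str]:
    """Pick the k clients with the highest reported training loss.

    Assumption: a higher loss means the client's data is more valuable
    to the global model. Ties are broken by client id for determinism.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(losses, key=lambda cid: (-losses[cid], cid))
    return ranked[:k]


if __name__ == "__main__":
    reported = {"phone-a": 0.2, "phone-b": 0.9, "phone-c": 0.5}
    print(select_clients_by_loss(reported, k=2))
```

Only the selected clients would be asked to train and upload in the next round, which is where the communication saving comes from.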

Machine Learning - Distributed, Parallel, and Cluster Computing

A Federated Learning Framework for Healthcare IoT devices

Binhang Yuan, Song Ge, Wenhui Xing (2020)

The Internet of Things (IoT) revolution has shown the potential to give rise to many medical applications with access to large volumes of healthcare data collected by IoT devices. However, the increasing demand for healthcare data privacy and security makes each IoT device an isolated island of data. Further, the limited computation and communication capacity of wearable healthcare devices restricts the application of vanilla federated learning. To this end, we propose an advanced federated learning framework for training deep neural networks, in which the network is partitioned and allocated to IoT devices and a centralized server, so that most of the training computation is handled by the powerful server. The sparsification of activations and gradients significantly reduces the communication overhead. Empirical studies suggest that the proposed framework guarantees a low accuracy loss while requiring only 0.2% of the synchronization traffic of vanilla federated learning.
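As an illustration of how sparsification can cut communication, here is a minimal sketch of top-k magnitude sparsification, one common scheme. The abstract does not say which sparsification method the paper uses, so this choice is an assumption, and the helper names `sparsify_top_k` and `densify` are hypothetical.

```python
from typing import Dict, List


def sparsify_top_k(values: List[float], k: int) -> Dict[int, float]:
    """Keep only the k entries with the largest magnitude.

    Returns an index -> value mapping, which is what would be transmitted
    instead of the full dense vector.
    """
    if k <= 0:
        return {}
    top = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)[:k]
    return {i: values[i] for i in sorted(top)}


def densify(sparse: Dict[int, float], length: int) -> List[float]:
    """Rebuild a dense vector on the receiver, filling dropped entries with 0."""
    dense = [0.0] * length
    for index, value in sparse.items():
        dense[index] = value
    return dense


if __name__ == "__main__":
    gradient = [0.1, -2.0, 0.05, 1.5]
    packed = sparsify_top_k(gradient, k=2)
    print(packed)
    print(densify(packed, len(gradient)))
```

The receiver treats the dropped entries as zero, so the traffic scales with `k` rather than with the vector length.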

Machine Learning - Distributed, Parallel, and Cluster Computing - Signal Processing

Communication-Efficient Hierarchical Federated Learning for IoT Heterogeneous Systems with Imbalanced Data

Alaa Awad Abdellatif, Naram Mhaisen, Amr Mohamed (2021)

Federated learning (FL) is a distributed learning methodology that allows multiple nodes to cooperatively train a deep learning model without sharing their local data. It is a promising solution for telemonitoring systems that demand intensive data collection from different locations, for the detection, classification, and prediction of future events, while maintaining a strict privacy constraint. Due to privacy concerns and critical communication bottlenecks, it can become impractical to send the updated FL models to a centralized server. Thus, this paper studies the potential of hierarchical FL in heterogeneous IoT systems and proposes an optimized solution for user assignment and resource allocation on multiple edge nodes. In particular, this work focuses on a generic class of machine learning models that are trained using gradient-descent-based schemes, while considering the practical constraints of non-uniformly distributed data across different users. We evaluate the proposed system using two real-world datasets and show that it outperforms state-of-the-art FL solutions. In particular, our numerical results highlight the effectiveness of our approach and its ability to provide a 4-6% increase in classification accuracy with respect to hierarchical FL schemes that consider distance-based user assignment. Furthermore, the proposed approach could significantly accelerate FL training and reduce communication overhead by providing a 75-85% reduction in the communication rounds between edge nodes and the centralized server for the same model accuracy.

Machine Learning - Distributed, Parallel, and Cluster Computing - Multiagent Systems