A first look into the carbon footprint of federated learning

222 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Xinchi Qiu

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Xinchi Qiu - Titouan Parcollet - Javier Fernandez-Marques

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Despite impressive results, deep learning-based technologies also raise severe privacy and environmental concerns induced by the training procedure often conducted in datacenters. In response, alternatives to centralized training such as Federated Learning (FL) have emerged. Perhaps unexpectedly, FL, in particular, is starting to be deployed at a global scale by companies that must adhere to new legal demands and policies originating from governments and civil society for privacy protection. However, the potential environmental impact related to FL remains unclear and unexplored. This paper offers the first-ever systematic study of the carbon footprint of FL. First, we propose a rigorous model to quantify the carbon footprint, hence facilitating the investigation of the relationship between FL design and carbon emissions. Then, we compare the carbon footprint of FL to traditional centralized learning. Our findings show that FL, despite being slower to converge in some cases, may result in a comparatively greener impact than a centralized equivalent setup. We performed extensive experiments across different types of datasets, settings, and various deep learning models with FL. Finally, we highlight and connect the reported results to the future challenges and trends in FL to reduce its environmental impact, including algorithms efficiency, hardware capabilities, and stronger industry transparency.

قيم البحث

95 - Michael Zhang , Karan Sapra , Sanja Fidler 2020

While federated learning traditionally aims to train a single global model across decentralized local datasets, one model may not always be ideal for all participating clients. Here we propose an alternative, where each client only federates with oth er relevant clients to obtain a stronger model per client-specific objectives. To achieve this personalization, rather than computing a single model average with constant weights for the entire federation as in traditional FL, we efficiently calculate optimal weighted model combinations for each client, based on figuring out how much a client can benefit from anothers model. We do not assume knowledge of any underlying data distributions or client similarities, and allow each client to optimize for arbitrary target distributions of interest, enabling greater flexibility for personalization. We evaluate and characterize our method on a variety of federated settings, datasets, and degrees of local data heterogeneity. Our method outperforms existing alternatives, while also enabling new features for personalized FL such as transfer outside of local data distributions.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التعلم الالي

Federated Reconstruction: Partially Local Federated Learning

98 - Karan Singhal , Hakim Sidahmed , Zachary Garrett 2021

Personalization methods in federated learning aim to balance the benefits of federated and local training for data availability, communication cost, and robustness to client heterogeneity. Approaches that require clients to communicate all model para meters can be undesirable due to privacy and communication constraints. Other approaches require always-available or stateful clients, impractical in large-scale cross-device settings. We introduce Federated Reconstruction, the first model-agnostic framework for partially local federated learning suitable for training and inference at scale. We motivate the framework via a connection to model-agnostic meta learning, empirically demonstrate its performance over existing approaches for collaborative filtering and next word prediction, and release an open-source library for evaluating approaches in this setting. We also describe the successful deployment of this approach at scale for federated collaborative filtering in a mobile keyboard application.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

Federated Graph Learning -- A Position Paper

253 - Huanding Zhang , Tao Shen , Fei Wu 2021

Graph neural networks (GNN) have been successful in many fields, and derived various researches and applications in real industries. However, in some privacy sensitive scenarios (like finance, healthcare), training a GNN model centrally faces challen ges due to the distributed data silos. Federated learning (FL) is a an emerging technique that can collaboratively train a shared model while keeping the data decentralized, which is a rational solution for distributed GNN training. We term it as federated graph learning (FGL). Although FGL has received increasing attention recently, the definition and challenges of FGL is still up in the air. In this position paper, we present a categorization to clarify it. Considering how graph data are distributed among clients, we propose four types of FGL: inter-graph FL, intra-graph FL and graph-structured FL, where intra-graph is further divided into horizontal and vertical FGL. For each type of FGL, we make a detailed discussion about the formulation and applications, and propose some potential challenges.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية بنية الشبكات والإنترنت

Anarchic Federated Learning

254 - Haibo Yang , Xin Zhang , Prashant Khanduri 2021

Present-day federated learning (FL) systems deployed over edge networks have to consistently deal with a large number of workers with high degrees of heterogeneity in data and/or computing capabilities. This diverse set of workers necessitates the de velopment of FL algorithms that allow: (1) flexible worker participation that grants the workers capability to engage in training at will, (2) varying number of local updates (based on computational resources) at each worker along with asynchronous communication with the server, and (3) heterogeneous data across workers. To address these challenges, in this work, we propose a new paradigm in FL called ``Anarchic Federated Learning (AFL). In stark contrast to conventional FL models, each worker in AFL has complete freedom to choose i) when to participate in FL, and ii) the number of local steps to perform in each round based on its current situation (e.g., battery level, communication channels, privacy concerns). However, AFL also introduces significant challenges in algorithmic design because the server needs to handle the chaotic worker behaviors. Toward this end, we propose two Anarchic FedAvg-like algorithms with two-sided learning rates for both cross-device and cross-silo settings, which are named AFedAvg-TSLR-CD and AFedAvg-TSLR-CS, respectively. For general worker information arrival processes, we show that both algorithms retain the highly desirable linear speedup effect in the new AFL paradigm. Moreover, we show that our AFedAvg-TSLR algorithmic framework can be viewed as a {em meta-algorithm} for AFL in the sense that they can utilize advanced FL algorithms as worker- and/or server-side optimizers to achieve enhanced performance under AFL. We validate the proposed algorithms with extensive experiments on real-world datasets.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية

VAFL: a Method of Vertical Asynchronous Federated Learning

334 - Tianyi Chen , Xiao Jin , Yuejiao Sun 2020

Horizontal Federated learning (FL) handles multi-client data that share the same set of features, and vertical FL trains a better predictor that combine all the features from different clients. This paper targets solving vertical FL in an asynchronou s fashion, and develops a simple FL method. The new method allows each client to run stochastic gradient algorithms without coordination with other clients, so it is suitable for intermittent connectivity of clients. This method further uses a new technique of perturbed local embedding to ensure data privacy and improve communication efficiency. Theoretically, we present the convergence rate and privacy level of our method for strongly convex, nonconvex and even nonsmooth objectives separately. Empirically, we apply our method to FL on various image and healthcare datasets. The results compare favorably to centralized and synchronous FL methods.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التحسين والتحكم