
GTG-Shapley: Efficient and Accurate Participant Contribution Evaluation in Federated Learning

Added by Zelei Liu
Publication date: 2021
Language: English





Federated Learning (FL) bridges the gap between collaborative machine learning and preserving data privacy. To sustain the long-term operation of an FL ecosystem, it is important to attract high-quality data owners with appropriate incentive schemes. As an important building block of such incentive schemes, it is essential to fairly evaluate participants' contributions to the performance of the final FL model without exposing their private data. Shapley Value (SV)-based techniques have been widely adopted to provide fair evaluation of FL participant contributions. However, existing approaches incur significant computation costs, making them difficult to apply in practice. In this paper, we propose the Guided Truncation Gradient Shapley (GTG-Shapley) approach to address this challenge. It reconstructs FL models from gradient updates for SV calculation instead of repeatedly training with different combinations of FL participants. In addition, we design a guided Monte Carlo sampling approach combined with within-round and between-round truncation to further reduce the number of model reconstructions and evaluations required. We evaluate GTG-Shapley through extensive experiments under diverse realistic data distribution settings. The results demonstrate that GTG-Shapley can closely approximate actual Shapley values while significantly improving computational efficiency compared to the state of the art, especially under non-i.i.d. settings.
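The following is a minimal, illustrative Python sketch of the sampling-and-truncation idea described in the abstract, not the authors' released code. Helper names such as evaluate, global_model, and updates are assumptions; models are represented as plain NumPy parameter vectors, and the utility is whatever evaluate returns (e.g., validation accuracy).

import random
import numpy as np

def reconstruct(global_model, updates, coalition):
    # Rebuild the coalition's model by applying the average of its members'
    # gradient updates to the current global model (no retraining needed).
    if not coalition:
        return global_model
    return global_model + sum(updates[i] for i in coalition) / len(coalition)

def gtg_shapley_round(global_model, updates, evaluate, num_perms=50, eps=1e-3):
    n = len(updates)
    phi = np.zeros(n)
    v_empty = evaluate(global_model)
    v_full = evaluate(reconstruct(global_model, updates, list(range(n))))
    # Between-round truncation: if the full aggregate barely changes utility,
    # skip the per-participant evaluations for this round entirely.
    if abs(v_full - v_empty) < eps:
        return phi
    for t in range(1, num_perms + 1):
        perm = random.sample(range(n), n)      # one sampled participant ordering
        prev, coalition = v_empty, []
        for i in perm:
            if abs(v_full - prev) < eps:
                marginal = 0.0                 # within-round truncation
            else:
                coalition.append(i)
                cur = evaluate(reconstruct(global_model, updates, coalition))
                marginal, prev = cur - prev, cur
            phi[i] += (marginal - phi[i]) / t  # running mean over permutations
    return phi

The paper's guided sampling further biases which permutations are evaluated rather than drawing them uniformly at random as above; plain random permutations are used here only to keep the sketch short.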



Related research

In federated learning (FL), fair and accurate measurement of each participant's contribution is of great significance. The level of contribution not only provides a rational metric for distributing financial benefits among federated participants, but also helps to discover malicious participants that try to poison the FL framework. Previous methods for contribution measurement were based on enumeration over possible combinations of federated participants. Their computation costs increase drastically with the number of participants or feature dimensions, making them inapplicable in practical situations. In this paper, an efficient method is proposed to evaluate the contributions of federated participants. It focuses on the horizontal FL framework, where clients compute parameter gradients over their local data and upload the gradients to the central server. Before aggregating the client gradients, the central server trains a data value estimator of the gradients using reinforcement learning techniques. As shown by experimental results, the proposed method consistently outperforms the conventional leave-one-out method in terms of valuation authenticity as well as time complexity.
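For contrast, here is a minimal sketch of the conventional leave-one-out baseline mentioned above (helper names evaluate, global_model, and updates are assumptions, with updates as NumPy parameter deltas): each participant's contribution is the utility lost when its update is excluded from the aggregate.

def leave_one_out(global_model, updates, evaluate):
    n = len(updates)
    v_full = evaluate(global_model + sum(updates) / n)
    scores = []
    for i in range(n):
        rest = [u for j, u in enumerate(updates) if j != i]
        # Utility of the model aggregated without participant i.
        v_without = evaluate(global_model + sum(rest) / len(rest)) if rest else evaluate(global_model)
        scores.append(v_full - v_without)   # utility lost without participant i
    return scores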
Sung Kuk Shyn, Donghee Kim, 2021
Client contribution evaluation, also known as data valuation, is a crucial approach in federated learning (FL) for client selection and incentive allocation. However, due to restrictions on access to raw data, only limited information such as the local weights and local data size of each client is available for quantifying client contribution. Using data size from the available information, we introduce an empirical evaluation method called Federated Client Contribution Evaluation through Accuracy Approximation (FedCCEA). This method builds the Accuracy Approximation Model (AAM), which estimates a simulated test accuracy using inputs of sampled data size, and extracts each client's data quality and data size to measure client contribution. FedCCEA offers several advantages: (1) it lets clients select their data size, (2) evaluation time remains feasible regardless of the number of clients, and (3) estimation stays precise in non-IID settings. We demonstrate the superiority of FedCCEA over previous methods through several experiments: client contribution distribution, client removal, and a robustness test under partial participation.
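A rough, hedged sketch of the general accuracy-approximation idea follows (this is not the authors' FedCCEA implementation; the regressor choice and the way contributions are read off the fitted model are assumptions for illustration): fit a model from sampled per-client data-size fractions to the resulting simulated test accuracy, then compare the clients' learned sensitivities.

import numpy as np
from sklearn.linear_model import LinearRegression

def fit_aam(size_fractions, accuracies):
    # size_fractions: (num_trials, num_clients) sampled data-size fractions per trial
    # accuracies:     (num_trials,) simulated test accuracy observed for each trial
    return LinearRegression().fit(size_fractions, accuracies)

def contributions_from_aam(aam):
    # Treat each client's (non-negative) coefficient as its marginal effect on
    # the approximated accuracy and normalise so contributions sum to one.
    w = np.clip(aam.coef_, 0.0, None)
    return w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))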
Centralized Training with Decentralized Execution (CTDE) has been a popular paradigm in cooperative Multi-Agent Reinforcement Learning (MARL) and is widely used in many real applications. One of the major challenges in the training process is credit assignment, which aims to deduce the contribution of each agent from the global rewards. Existing credit assignment methods focus on either decomposing the joint value function into individual value functions or measuring the impact of local observations and actions on the global value function. These approaches lack a thorough consideration of the complicated interactions among multiple agents, leading to unsuitable credit assignment and subsequently mediocre MARL results. We propose Shapley Counterfactual Credit Assignment, a novel method for explicit credit assignment which accounts for coalitions of agents. Specifically, the Shapley Value and its desired properties are leveraged in deep MARL to credit any combination of agents, which grants us the capability to estimate the individual credit of each agent. Despite this capability, the main technical difficulty lies in the computational complexity of the Shapley Value, which grows factorially with the number of agents. We instead utilize an approximation via Monte Carlo sampling, which reduces the sample complexity while maintaining effectiveness. We evaluate our method on StarCraft II benchmarks across different scenarios. It significantly outperforms existing cooperative MARL algorithms and achieves the state of the art, with especially large margins on the most difficult tasks.
We show that aggregated model updates in federated learning may be insecure. An untrusted central server may disaggregate user updates from sums of updates across participants given repeated observations, enabling the server to recover privileged information about individual users' private training data via traditional gradient inference attacks. Our method revolves around reconstructing participant information (e.g., which rounds of training users participated in) from aggregated model updates by leveraging summary information from device analytics commonly used to monitor, debug, and manage federated learning systems. Our attack is parallelizable, and we successfully disaggregate user updates in settings with up to thousands of participants. We quantitatively and qualitatively demonstrate significant improvements in the capability of various inference attacks on the disaggregated updates. Our attack enables the attribution of learned properties to individual users, violating anonymity, and shows that a determined central server may undermine the secure aggregation protocol to break individual users' data privacy in federated learning.
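A toy illustration of why aggregation alone is fragile when the server also knows who participated in each round (participation knowledge of the kind the analytics above would provide): recovering individual updates from per-round sums reduces to solving a linear system. All names are illustrative, and the toy assumes each user's update stays fixed across rounds.

import numpy as np

rng = np.random.default_rng(0)
num_users, num_rounds, dim = 5, 12, 3

true_updates = rng.normal(size=(num_users, dim))                                 # per-user updates (toy)
participation = rng.integers(0, 2, size=(num_rounds, num_users)).astype(float)   # 1 if the user joined that round
observed_sums = participation @ true_updates                                     # what the server observes each round

# If the participation matrix has full column rank, least squares recovers
# every individual user's update exactly from the aggregated sums.
if np.linalg.matrix_rank(participation) == num_users:
    recovered, *_ = np.linalg.lstsq(participation, observed_sums, rcond=None)
    print(np.allclose(recovered, true_updates))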
With the rapid growth in mobile computing, massive amounts of data and computing resources are now located at the edge. To this end, federated learning (FL) is becoming a widely adopted distributed machine learning (ML) paradigm, which aims to harness this expanding, skewed data locally in order to develop rich and informative models. In centralized FL, a collection of devices collaboratively solves an ML task under the coordination of a central server. However, existing FL frameworks make an overly simplistic assumption about network connectivity and ignore the communication bandwidth of the different links in the network. In this paper, we present and study a novel FL algorithm in which devices mostly collaborate with other devices in a pairwise manner. Our nonparametric approach is able to exploit network topology to reduce communication bottlenecks. We evaluate our approach on various FL benchmarks and demonstrate that our method achieves 10x better communication efficiency and around an 8% increase in accuracy compared to the centralized approach.
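Below is a minimal sketch of pairwise (gossip-style) parameter averaging of the kind the paragraph describes, assuming each device holds a NumPy parameter vector and a neighbour list derived from the communication graph; the names are illustrative, not the paper's API.

import random

def gossip_round(params, neighbours):
    # params:     dict device_id -> parameter vector (e.g., a NumPy array)
    # neighbours: dict device_id -> list of directly connected device_ids
    order = list(params)
    random.shuffle(order)                       # visit devices in random order
    for d in order:
        peer = random.choice(neighbours[d])     # pick one neighbour on the graph
        avg = (params[d] + params[peer]) / 2.0  # pairwise model averaging step
        params[d] = avg.copy()
        params[peer] = avg.copy()
    return params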
