No Arabic abstract
Federated Learning (FL) is a distributed machine learning paradigm where data is distributed among clients who collaboratively train a model in a computation process coordinated by a central server. By assigning a weight to each client based on the proportion of data instances it possesses, the rate of convergence to an accurate joint model can be greatly accelerated. Some previous works studied FL in a Byzantine setting, in which a fraction of the clients may send arbitrary or even malicious information regarding their model. However, these works either ignore the issue of data unbalancedness altogether or assume that client weights are apriori known to the server, whereas, in practice, it is likely that weights will be reported to the server by the clients themselves and therefore cannot be relied upon. We address this issue for the first time by proposing a practical weight-truncation-based preprocessing method and demonstrating empirically that it is able to strike a good balance between model quality and Byzantine robustness. We also establish analytically that our method can be applied to a randomly selected sample of client weights.
Gradient-based training in federated learning is known to be vulnerable to faulty/malicious worker nodes, which are often modeled as Byzantine clients. Previous work either makes use of auxiliary data at parameter server to verify the received gradients or leverages statistic-based methods to identify and remove malicious gradients from Byzantine clients. In this paper, we acknowledge that auxiliary data may not always be available in practice and focus on the statistic-based approach. However, recent work on model poisoning attacks have shown that well-crafted attacks can circumvent most of existing median- and distance-based statistical defense methods, making malicious gradients indistinguishable from honest ones. To tackle this challenge, we show that the element-wise sign of gradient vector can provide valuable insight in detecting model poisoning attacks. Based on our theoretical analysis of state-of-the-art attack, we propose a novel approach, textit{SignGuard}, to enable Byzantine-robust federated learning through collaborative malicious gradient filtering. More precisely, the received gradients are first processed to generate relevant magnitude, sign, and similarity statistics, which are then collaboratively utilized by multiple, parallel filters to eliminate malicious gradients before final aggregation. We further provide theoretical analysis of SignGuard by quantifying its convergence with appropriate choice of learning rate and under non-IID training data. Finally, extensive experiments of image and text classification tasks - including MNIST, Fashion-MNIST, CIFAR-10, and AG-News - are conducted together with recently proposed attacks and defense strategies. The numerical results demonstrate the effectiveness and superiority of our proposed approach.
Federated learning (FL) collaboratively aggregates a shared global model depending on multiple local clients, while keeping the training data decentralized in order to preserve data privacy. However, standard FL methods ignore the noisy client issue, which may harm the overall performance of the aggregated model. In this paper, we first analyze the noisy client statement, and then model noisy clients with different noise distributions (e.g., Bernoulli and truncated Gaussian distributions). To learn with noisy clients, we propose a simple yet effective FL framework, named Federated Noisy Client Learning (Fed-NCL), which is a plug-and-play algorithm and contains two main components: a data quality measurement (DQM) to dynamically quantify the data quality of each participating client, and a noise robust aggregation (NRA) to adaptively aggregate the local models of each client by jointly considering the amount of local training data and the data quality of each client. Our Fed-NCL can be easily applied in any standard FL workflow to handle the noisy client issue. Experimental results on various datasets demonstrate that our algorithm boosts the performances of different state-of-the-art systems with noisy clients.
Federated Learning is a distributed machine learning approach which enables model training without data sharing. In this paper, we propose a new federated learning algorithm, Federated Averaging with Client-level Momentum (FedCM), to tackle problems of partial participation and client heterogeneity in real-world federated learning applications. FedCM aggregates global gradient information in previous communication rounds and modifies client gradient descent with a momentum-like term, which can effectively correct the bias and improve the stability of local SGD. We provide theoretical analysis to highlight the benefits of FedCM. We also perform extensive empirical studies and demonstrate that FedCM achieves superior performance in various tasks and is robust to different levels of client numbers, participation rate and client heterogeneity.
We study robust distributed learning that involves minimizing a non-convex loss function with saddle points. We consider the Byzantine setting where some worker machines have abnormal or even arbitrary and adversarial behavior. In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used. We develop ByzantinePGD, a robust first-order algorithm that can provably escape saddle points and fake local minima, and converge to an approximate true local minimizer with low iteration complexity. As a by-product, we give a simpler algorithm and analysis for escaping saddle points in the usual non-Byzantine setting. We further discuss three robust gradient estimators that can be used in ByzantinePGD, including median, trimmed mean, and iterative filtering. We characterize their performance in concrete statistical settings, and argue for their near-optimality in low and high dimensional regimes.
Federated learning is a novel framework that enables resource-constrained edge devices to jointly learn a model, which solves the problem of data protection and data islands. However, standard federated learning is vulnerable to Byzantine attacks, which will cause the global model to be manipulated by the attacker or fail to converge. On non-iid data, the current methods are not effective in defensing against Byzantine attacks. In this paper, we propose a Byzantine-robust framework for federated learning via credibility assessment on non-iid data (BRCA). Credibility assessment is designed to detect Byzantine attacks by combing adaptive anomaly detection model and data verification. Specially, an adaptive mechanism is incorporated into the anomaly detection model for the training and prediction of the model. Simultaneously, a unified update algorithm is given to guarantee that the global model has a consistent direction. On non-iid data, our experiments demonstrate that the BRCA is more robust to Byzantine attacks compared with conventional methods