No Arabic abstract
The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data and hence improving data privacy. In most cases, the assumption of independent and identically distributed samples across local clients does not hold for federated learning setups. Under this setting, neural network training performance may vary significantly according to the data distribution and even hurt training convergence. Most of the previous work has focused on a difference in the distribution of labels or client shifts. Unlike those settings, we address an important problem of FL, e.g., different scanners/sensors in medical imaging, different scenery distribution in autonomous driving (highway vs. city), where local clients store examples with different distributions compared to other clients, which we denote as feature shift non-iid. In this work, we propose an effective method that uses local batch normalization to alleviate the feature shift before averaging models. The resulting scheme, called FedBN, outperforms both classical FedAvg, as well as the state-of-the-art for non-iid data (FedProx) on our extensive experiments. These empirical results are supported by a convergence analysis that shows in a simplified setting that FedBN has a faster convergence rate than FedAvg. Code is available at https://github.com/med-air/FedBN.
Federated learning is a novel framework that enables resource-constrained edge devices to jointly learn a model, which solves the problem of data protection and data islands. However, standard federated learning is vulnerable to Byzantine attacks, which will cause the global model to be manipulated by the attacker or fail to converge. On non-iid data, the current methods are not effective in defensing against Byzantine attacks. In this paper, we propose a Byzantine-robust framework for federated learning via credibility assessment on non-iid data (BRCA). Credibility assessment is designed to detect Byzantine attacks by combing adaptive anomaly detection model and data verification. Specially, an adaptive mechanism is incorporated into the anomaly detection model for the training and prediction of the model. Simultaneously, a unified update algorithm is given to guarantee that the global model has a consistent direction. On non-iid data, our experiments demonstrate that the BRCA is more robust to Byzantine attacks compared with conventional methods
Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained in federated learning usually have worse performance than those trained in the standard centralized learning mode, especially when the training data are not independent and identically distributed (Non-IID) on the local devices. In this survey, we pro-vide a detailed analysis of the influence of Non-IID data on both parametric and non-parametric machine learning models in both horizontal and vertical federated learning. In addition, cur-rent research work on handling challenges of Non-IID data in federated learning are reviewed, and both advantages and disadvantages of these approaches are discussed. Finally, we suggest several future research directions before concluding the paper.
The success of machine learning applications often needs a large quantity of data. Recently, federated learning (FL) is attracting increasing attention due to the demand for data privacy and security, especially in the medical field. However, the performance of existing FL approaches often deteriorates when there exist domain shifts among clients, and few previous works focus on personalization in healthcare. In this article, we propose FedHealth 2, an extension of FedHealth cite{chen2020fedhealth} to tackle domain shifts and get personalized models for local clients. FedHealth 2 obtains the client similarities via a pretrained model, and then it averages all weighted models with preserving local batch normalization. Wearable activity recognition and COVID-19 auxiliary diagnosis experiments have evaluated that FedHealth 2 can achieve better accuracy (10%+ improvement for activity recognition) and personalized healthcare without compromising privacy and security.
Federated learning enables multiple clients to collaboratively learn a global model by periodically aggregating the clients models without transferring the local data. However, due to the heterogeneity of the system and data, many approaches suffer from the client-drift issue that could significantly slow down the convergence of the global model training. As clients perform local updates on heterogeneous data through heterogeneous systems, their local models drift apart. To tackle this issue, one intuitive idea is to guide the local model training by the global teachers, i.e., past global models, where each client learns the global knowledge from past global models via adaptive knowledge distillation techniques. Coming from these insights, we propose a novel approach for heterogeneous federated learning, namely FedGKD, which fuses the knowledge from historical global models for local training to alleviate the client-drift issue. In this paper, we evaluate FedGKD with extensive experiments on various CV/NLP datasets (i.e., CIFAR-10/100, Tiny-ImageNet, AG News, SST5) and different heterogeneous settings. The proposed method is guaranteed to converge under common assumptions, and achieves superior empirical accuracy in fewer communication runs than five state-of-the-art methods.
Federated learning is a widely used distributed deep learning framework that protects the privacy of each client by exchanging model parameters rather than raw data. However, federated learning suffers from high communication costs, as a considerable number of model parameters need to be transmitted many times during the training process, making the approach inefficient, especially when the communication network bandwidth is limited. This article proposes RingFed, a novel framework to reduce communication overhead during the training process of federated learning. Rather than transmitting parameters between the center server and each client, as in original federated learning, in the proposed RingFed, the updated parameters are transmitted between each client in turn, and only the final result is transmitted to the central server, thereby reducing the communication overhead substantially. After several local updates, clients first send their parameters to another proximal client, not to the center server directly, to preaggregate. Experiments on two different public datasets show that RingFed has fast convergence, high model accuracy, and low communication cost.