In federated learning (FL), reducing the communication overhead is one of the most critical challenges, since the parameter server and the mobile devices exchange the training parameters over wireless links. With this in mind, we adopt the idea of SignSGD, in which only the signs of the gradients are exchanged. Moreover, most existing works assume that channel state information (CSI) is available at both the mobile devices and the parameter server, so that the mobile devices can adopt fixed transmission rates dictated by the channel capacity. In this work, only CSI at the parameter server is assumed, and channel capacity with outage is considered. In this case, an essential problem for the mobile devices is to select appropriate local-processing and communication parameters (including the transmission rates) to achieve a desired balance between the overall learning performance and their energy consumption. Two optimization problems are formulated and solved, which optimize the learning performance subject to an energy consumption requirement, and vice versa. Furthermore, considering that the data may be distributed across the mobile devices in a highly uneven fashion in FL, a stochastic sign-based algorithm is proposed. Extensive simulations demonstrate the effectiveness of the proposed methods.
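As a concrete illustration of the sign-based exchange described above, the following minimal sketch (with illustrative function names and a toy majority-vote aggregator) shows how devices could transmit only gradient signs and how a parameter server could combine them; it is a simplified caricature under these assumptions, not the paper's full scheme with outage-aware rate selection.

```python
# Minimal sketch of sign-based gradient exchange with majority-vote aggregation.
# Function names, learning rate, and the toy data are illustrative assumptions.
import numpy as np

def local_sign_update(gradient):
    """Each device transmits only the sign of its local gradient (+1 / -1)."""
    return np.sign(gradient)

def server_aggregate(sign_list):
    """Parameter server takes an element-wise majority vote over device signs."""
    return np.sign(np.sum(sign_list, axis=0))

def global_step(weights, device_gradients, lr=0.01):
    signs = [local_sign_update(g) for g in device_gradients]
    vote = server_aggregate(signs)
    return weights - lr * vote

# Toy usage: 3 devices, a 5-dimensional parameter vector.
w = np.zeros(5)
grads = [np.random.randn(5) for _ in range(3)]
w = global_step(w, grads)
```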
This work investigates fault-resilient federated learning when the data samples are non-uniformly distributed across workers and the number of faulty workers is unknown to the central server. In the presence of adversarially faulty workers who may strategically corrupt datasets, the exchanged local messages (e.g., local gradients and/or local model parameters) can be unreliable, and thus the vanilla stochastic gradient descent (SGD) algorithm is not guaranteed to converge. Recently developed algorithms improve upon vanilla SGD by providing robustness to faulty workers at the price of slowing down convergence. To remedy this limitation, the present work introduces a fault-resilient proximal gradient (FRPG) algorithm that relies on Nesterov's acceleration technique. To reduce the communication overhead of FRPG, a local (L) FRPG algorithm is also developed to allow for intermittent server-worker parameter exchanges. For strongly convex loss functions, FRPG and LFRPG have provably faster convergence rates than a benchmark robust stochastic aggregation algorithm. Moreover, LFRPG converges faster than FRPG while using the same number of communication rounds. Numerical tests performed on various real datasets confirm the accelerated convergence of FRPG and LFRPG over the robust stochastic aggregation benchmark and competing alternatives.
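To make the role of Nesterov's acceleration concrete, here is a hedged sketch that pairs a momentum look-ahead step with a simple fault-resilient aggregator. The coordinate-wise median used below is a stand-in chosen for illustration; the proximal term, step-size schedule, and the exact FRPG/LFRPG updates are omitted.

```python
# Illustrative sketch: Nesterov-style acceleration combined with a robust
# (coordinate-wise median) aggregation of worker gradients. Not the authors'
# exact FRPG algorithm; parameters and the toy loss are assumptions.
import numpy as np

def robust_aggregate(worker_grads):
    """Coordinate-wise median is one simple fault-resilient aggregator."""
    return np.median(np.stack(worker_grads), axis=0)

def nesterov_step(x, x_prev, grad_fns, lr=0.05, momentum=0.9):
    # Look-ahead point; workers evaluate their local gradients at y.
    y = x + momentum * (x - x_prev)
    g = robust_aggregate([grad(y) for grad in grad_fns])
    return y - lr * g, x

# Toy usage: 4 honest workers with a quadratic loss f(x) = ||x||^2, 1 faulty worker.
honest = [lambda y: 2 * y for _ in range(4)]
faulty = [lambda y: np.full_like(y, 1e3)]       # faulty worker sends a corrupted gradient
x, x_prev = np.ones(4), np.ones(4)
for _ in range(50):
    x, x_prev = nesterov_step(x, x_prev, honest + faulty)
# x converges toward the minimizer 0 despite the faulty worker.
```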
This paper studies a federated edge learning system, in which an edge server coordinates a set of edge devices to train a shared machine learning (ML) model based on their locally distributed data samples. During the distributed training, we exploit a joint communication and computation design to improve the system energy efficiency, in which both the communication resource allocation for global ML parameter aggregation and the computation resource allocation for locally updating ML parameters are jointly optimized. In particular, we consider two transmission protocols for edge devices to upload ML parameters to the edge server, based on non-orthogonal multiple access and time-division multiple access, respectively. Under both protocols, we minimize the total energy consumption at all edge devices over a particular finite training duration subject to a given training accuracy, by jointly optimizing the transmission power and rates at edge devices for uploading ML parameters and their central processing unit (CPU) frequencies for local updates. We propose efficient algorithms to optimally solve the formulated energy minimization problems using techniques from convex optimization. Numerical results show that, compared to other benchmark schemes, our proposed joint communication and computation design significantly improves the energy efficiency of the federated edge learning system by properly balancing the energy tradeoff between communication and computation.
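For intuition about the communication-computation energy tradeoff, the sketch below uses energy models that are common in this literature (dynamic CPU energy proportional to the square of the CPU frequency, and transmission time given by a Shannon-type rate). The symbols, default values, and function names are illustrative assumptions, not necessarily the paper's exact formulation.

```python
# Toy energy accounting for one device in one global round, under the common
# assumptions E_comp ~ kappa * cycles * f^2 per local round and
# rate = B * log2(1 + p*h/N0) for uploading the model. All values are illustrative.
import math

def computation_energy(kappa, cycles_per_sample, samples, cpu_freq, local_rounds):
    """Dynamic CPU energy: kappa * total_cycles * f^2, summed over local rounds."""
    return local_rounds * kappa * cycles_per_sample * samples * cpu_freq ** 2

def communication_energy(tx_power, model_bits, bandwidth, channel_gain, noise_power):
    """Energy to upload the model once: power * (bits / achievable rate)."""
    rate = bandwidth * math.log2(1 + tx_power * channel_gain / noise_power)
    return tx_power * model_bits / rate

E_cmp = computation_energy(kappa=1e-28, cycles_per_sample=2e4, samples=500,
                           cpu_freq=1e9, local_rounds=5)
E_com = communication_energy(tx_power=0.2, model_bits=1e6, bandwidth=1e6,
                             channel_gain=1e-8, noise_power=1e-10)
print(f"computation {E_cmp:.4f} J, communication {E_com:.4f} J")
```

Raising the CPU frequency or transmission power shortens the round but inflates the corresponding energy term, which is the tradeoff the joint design balances.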
Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FetchSGD, to overcome these challenges. FetchSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates and good convergence. We prove that FetchSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.
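The key property FetchSGD exploits is that sketching is a linear map. The minimal Count Sketch below (hash choices and table sizes are illustrative) verifies this linearity, which is what allows client sketches to be summed at the aggregator and momentum and error accumulation to be carried out in sketch space; the full FetchSGD pipeline (top-k recovery, momentum, error feedback) is not reproduced here.

```python
# Minimal Count Sketch illustrating the linearity FetchSGD relies on.
# Sizes and hash constructions are illustrative assumptions.
import numpy as np

class CountSketch:
    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.buckets = rng.integers(0, cols, size=(rows, dim))  # bucket hash per row
        self.signs = rng.choice([-1, 1], size=(rows, dim))      # sign hash per row
        self.rows, self.cols = rows, cols

    def sketch(self, vec):
        table = np.zeros((self.rows, self.cols))
        for r in range(self.rows):
            np.add.at(table[r], self.buckets[r], self.signs[r] * vec)
        return table

    def unsketch(self, table):
        # Median-of-estimates recovery of each coordinate.
        est = self.signs * table[np.arange(self.rows)[:, None], self.buckets]
        return np.median(est, axis=0)

cs = CountSketch(rows=5, cols=50, dim=1000)
u1, u2 = np.random.randn(1000), np.random.randn(1000)
# Linearity: sketch(u1) + sketch(u2) equals sketch(u1 + u2), so the server can
# aggregate (and keep momentum / error state) entirely in sketch space.
assert np.allclose(cs.sketch(u1) + cs.sketch(u2), cs.sketch(u1 + u2))
```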
There is an increasing interest in a fast-growing machine learning technique called Federated Learning (FL), in which the model training is distributed over mobile user equipments (UEs), exploiting the UEs' local computation and training data. Despite its advantages in data privacy preservation, FL still faces challenges from heterogeneity across the UEs' data and physical resources. We first propose an FL algorithm that can handle the heterogeneous UE data challenge without further assumptions beyond strongly convex and smooth loss functions. We provide the convergence rate characterizing the trade-off between the local computation rounds at each UE to update its local model and the global communication rounds to update the FL global model. We then apply the proposed FL algorithm in wireless networks, formulating a resource allocation optimization problem that captures the trade-off between the FL convergence wall-clock time and the energy consumption of UEs with heterogeneous computing and power resources. Even though the wireless resource allocation problem of FL is non-convex, we exploit the problem's structure to decompose it into three sub-problems and analyze their closed-form solutions as well as insights into the problem design. Finally, we illustrate the theoretical analysis for the new algorithm with TensorFlow experiments and extensive numerical results for the wireless resource allocation sub-problems. The experimental results not only verify the theoretical convergence but also show that our proposed algorithm outperforms the vanilla FedAvg algorithm in terms of convergence rate and testing accuracy.
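The local-versus-global round structure discussed above can be sketched generically as follows: each UE runs K local gradient steps on its own data before the server averages the local models. The quadratic local losses, unweighted averaging, and parameter values are illustrative assumptions rather than the paper's exact algorithm.

```python
# Generic local-rounds / global-rounds federated training loop.
# Local least-squares losses and all constants are illustrative.
import numpy as np

def local_update(w_global, A, b, local_rounds, lr=0.1):
    """K steps of gradient descent on a local loss (1/n)*||A w - b||^2."""
    w = w_global.copy()
    for _ in range(local_rounds):
        w -= lr * (2.0 / len(b)) * (A.T @ (A @ w - b))
    return w

def federated_round(w_global, datasets, local_rounds):
    local_models = [local_update(w_global, A, b, local_rounds) for A, b in datasets]
    return np.mean(local_models, axis=0)   # simple unweighted averaging

# Toy usage: 4 UEs with heterogeneous local data, 20 global rounds, K = 5.
rng = np.random.default_rng(0)
w_true = rng.standard_normal(3)
datasets = []
for _ in range(4):
    A = rng.standard_normal((30, 3))
    datasets.append((A, A @ w_true + 0.1 * rng.standard_normal(30)))
w = np.zeros(3)
for _ in range(20):
    w = federated_round(w, datasets, local_rounds=5)
```

Increasing `local_rounds` reduces the number of global communication rounds needed but raises per-round local computation, which is exactly the trade-off the convergence analysis characterizes.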
Petabytes of data are generated each day by the emerging Internet of Things (IoT), but only a small fraction can ultimately be collected and used for machine learning (ML) purposes due to concerns about data and privacy leakage, which seriously retards the growth of ML. To alleviate this problem, federated learning is proposed to train models on the combined data of multiple clients without sharing datasets within the cluster. Nevertheless, federated learning introduces massive communication overhead, as the data synchronized in each epoch is of the same size as the model, leading to low communication efficiency. Consequently, various methods, mainly focusing on reducing communication rounds and compressing data, have been proposed to lower the communication overhead of federated learning. In this paper, we propose Overlap-FedAvg, a framework that parallelizes the model training phase with the model uploading and downloading phase, so that the latter can be completely hidden by the former. On top of vanilla FedAvg, Overlap-FedAvg is further developed with a hierarchical computing strategy, a data compensation mechanism, and a Nesterov accelerated gradients (NAG) algorithm. Besides, Overlap-FedAvg is orthogonal to many other compression methods, so they can be applied together to maximize the utilization of the cluster. Furthermore, a theoretical analysis is provided to prove the convergence of the proposed Overlap-FedAvg framework. Extensive experiments on both conventional and recurrent tasks with multiple models and datasets also demonstrate that the proposed Overlap-FedAvg framework substantially accelerates the federated learning process.
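A minimal sketch of the overlap idea, assuming a single client and simulated delays: the previous round's update is uploaded in a background thread while local training for the next round proceeds, so communication is hidden behind computation. The data compensation mechanism, hierarchical computing strategy, and NAG step are omitted, and all names and timings are placeholders.

```python
# Illustrative overlap of communication and computation on one client.
# Delays and names are placeholders; this is not the full Overlap-FedAvg system.
import threading
import time

def upload(update):
    time.sleep(0.5)                 # simulated network transfer of the update
    print("upload finished:", update)

def local_training(round_id):
    time.sleep(0.5)                 # simulated local computation
    return f"update_{round_id}"

update = local_training(0)
for r in range(1, 4):
    pending = threading.Thread(target=upload, args=(update,))
    pending.start()                 # send last round's update in the background...
    update = local_training(r)      # ...while computing this round's update
    pending.join()                  # ensure the upload finished before moving on
print("last local update ready:", update)
```

Since the (stale) upload and the current local training run concurrently, the wall-clock time per round is roughly max(communication, computation) rather than their sum.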