No Arabic abstract
Federated learning (FL) has emerged as a promising master/slave learning paradigm to alleviate systemic privacy risks and communication costs incurred by cloud-centric machine learning methods. However, it is very challenging to resist the single point of failure of the master aggregator and attacks from malicious participants while guaranteeing model convergence speed and accuracy. Recently, blockchain has been brought into FL systems transforming the paradigm to a decentralized manner thus further improve the system security and learning reliability. Unfortunately, the traditional consensus mechanism and architecture of blockchain systems can hardly handle the large-scale FL task due to the huge resource consumption, limited transaction throughput, and high communication complexity. To address these issues, this paper proposes a two-layer blockchaindriven FL framework, called as ChainsFL, which is composed of multiple subchain networks (subchain layer) and a direct acyclic graph (DAG)-based mainchain (mainchain layer). In ChainsFL, the subchain layer limits the scale of each shard for a small range of information exchange, and the mainchain layer allows each shard to share and validate the learning model in parallel and asynchronously to improve the efficiency of cross-shard validation. Furthermore, the FL procedure is customized to deeply integrate with blockchain technology, and the modified DAG consensus mechanism is proposed to mitigate the distortion caused by abnormal models. In order to provide a proof-ofconcept implementation and evaluation, multiple subchains base on Hyperledger Fabric are deployed as the subchain layer, and the self-developed DAG-based mainchain is deployed as the mainchain layer. The experimental results show that ChainsFL provides acceptable and sometimes better training efficiency and stronger robustness compared with the typical existing FL systems.
Recently, a number of backdoor attacks against Federated Learning (FL) have been proposed. In such attacks, an adversary injects poisoned model updates into the federated model aggregation process with the goal of manipulating the aggregated model to provide false predictions on specific adversary-chosen inputs. A number of defenses have been proposed; but none of them can effectively protect the FL process also against so-called multi-backdoor attacks in which multiple different backdoors are injected by the adversary simultaneously without severely impacting the benign performance of the aggregated model. To overcome this challenge, we introduce FLGUARD, a poisoning defense framework that is able to defend FL against state-of-the-art backdoor attacks while simultaneously maintaining the benign performance of the aggregated model. Moreover, FL is also vulnerable to inference attacks, in which a malicious aggregator can infer information about clients training data from their model updates. To thwart such attacks, we augment FLGUARD with state-of-the-art secure computation techniques that securely evaluate the FLGUARD algorithm. We provide formal argumentation for the effectiveness of our FLGUARD and extensively evaluate it against known backdoor attacks on several datasets and applications (including image classification, word prediction, and IoT intrusion detection), demonstrating that FLGUARD can entirely remove backdoors with a negligible effect on accuracy. We also show that private FLGUARD achieves practical runtimes.
The emerging Federated Edge Learning (FEL) technique has drawn considerable attention, which not only ensures good machine learning performance but also solves data island problems caused by data privacy concerns. However, large-scale FEL still faces following crucial challenges: (i) there lacks a secure and communication-efficient model training scheme for FEL; (2) there is no scalable and flexible FEL framework for updating local models and global model sharing (trading) management. To bridge the gaps, we first propose a blockchain-empowered secure FEL system with a hierarchical blockchain framework consisting of a main chain and subchains. This framework can achieve scalable and flexible decentralized FEL by individually manage local model updates or model sharing records for performance isolation. A Proof-of-Verifying consensus scheme is then designed to remove low-quality model updates and manage qualified model updates in a decentralized and secure manner, thereby achieving secure FEL. To improve communication efficiency of the blockchain-empowered FEL, a gradient compression scheme is designed to generate sparse but important gradients to reduce communication overhead without compromising accuracy, and also further strengthen privacy preservation of training data. The security analysis and numerical results indicate that the proposed schemes can achieve secure, scalable, and communication-efficient decentralized FEL.
Cryptocurrencies, implemented with blockchain protocols, promise to become a global payment system if they can overcome performance limitations. Rapidly advancing architectures improve on latency and throughput, but most require all participating servers to process all transactions. Several recent works propose to shard the system, such that each machine would only process a subset of the transactions. However, we identify a denial-of-service attack that is exposed by these solutions - an attacker can generate transactions that would overload a single shard, thus delaying processing in the entire system. Moreover, we show that in common scenarios, these protocols require most node operators to process almost all blockchain transactions. We present Ostraka, a blockchain node architecture that shards (parallelizes) the nodes themselves. We prove that replacing a unified node with an Ostraka node does not affect the security of the underlying consensus mechanism. We evaluate analytically and experimentally block propagation and processing in various settings. Ostraka allows nodes in the network to scale, without costly coordination. In our experiments, Ostraka nodes transaction processing rate grows linearly with the addition of resources.
Blockchain is an incrementally updated ledger maintained by distributed nodes rather than centralized organizations. The current blockchain technology faces scalability issues, which include two aspects: low transaction throughput and high storage capacity costs. This paper studies the blockchain structure based on state sharding technology, and mainly solves the problem of non-scalability of block chain storage. This paper designs and implements the blockchain state sharding scheme, proposes a specific state sharding data structure and algorithm implementation, and realizes a complete blockchain structure so that the blockchain has the advantages of high throughput, processing a large number of transactions and saving storage costs. Experimental results show that a blockchain network with more than 100,000 nodes can be divided into 1024 shards. A blockchain network with this structure can process 500,000 transactions in about 5 seconds. If the consensus time of the blockchain is about 10 seconds, and the block generation time of the blockchain system of the sharding mechanism is 15 seconds, the transaction throughput can reach 33,000 tx/sec. Experimental results show that the throughput of the proposed protocol increases with the increase of the network node size. This confirms the scalability of the blockchain structure based on sharding technology.
Secure federated learning is a privacy-preserving framework to improve machine learning models by training over large volumes of data collected by mobile users. This is achieved through an iterative process where, at each iteration, users update a global model using their local datasets. Each user then masks its local model via random keys, and the masked models are aggregated at a central server to compute the global model for the next iteration. As the local models are protected by random masks, the server cannot observe their true values. This presents a major challenge for the resilience of the model against adversarial (Byzantine) users, who can manipulate the global model by modifying their local models or datasets. Towards addressing this challenge, this paper presents the first single-server Byzantine-resilient secure aggregation framework (BREA) for secure federated learning. BREA is based on an integrated stochastic quantization, verifiable outlier detection, and secure model aggregation approach to guarantee Byzantine-resilience, privacy, and convergence simultaneously. We provide theoretical convergence and privacy guarantees and characterize the fundamental trade-offs in terms of the network size, user dropouts, and privacy protection. Our experiments demonstrate convergence in the presence of Byzantine users, and comparable accuracy to conventional federated learning benchmarks.