VFL: A Verifiable Federated Learning with Privacy-Preserving for Big Data in Industrial IoT


Added by Anmin Fu

Publication date: 2020

Fields: Informatics Engineering

Language: English

Authors: Anmin Fu, Xianglong Zhang, Naixue Xiong

Cryptography and Security


Because big data offers strong analytical power, deep learning is widely used to train on data collected in the industrial IoT. However, because of privacy concerns, traditional centralized learning, which gathers all data in one place, is not applicable to industrial scenarios in which the training sets are sensitive. Federated learning has recently received widespread attention because it trains a model by relying only on gradient aggregation, without accessing the training sets. Existing research shows, however, that the shared gradients still retain sensitive information about the training set. Even worse, a malicious aggregation server may return forged aggregated gradients. In this paper, we propose VFL, a verifiable federated learning scheme with privacy preservation for big data in the industrial IoT. Specifically, we use Lagrange interpolation with carefully chosen interpolation points to verify the correctness of the aggregated gradients. Compared with existing schemes, the verification overhead of VFL remains constant regardless of the number of participants. Moreover, we employ blinding to protect the privacy of the gradients submitted by the participants. If no more than n-2 of the n participants collude with the aggregation server, VFL guarantees that the encrypted gradients of the other participants cannot be inverted. Experimental evaluations corroborate the practical performance of the VFL framework, showing high accuracy and efficiency.
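To illustrate the general idea of blinding gradients so that only their sum is visible to the aggregator, here is a minimal sketch in Python. It uses generic pairwise additive masks that cancel on aggregation; this is an assumption chosen for illustration, not the construction used in VFL, and it omits the Lagrange-interpolation verification step entirely. The names `MODULUS`, `pairwise_masks`, `blind`, and `aggregate` are hypothetical, gradients are assumed to be already encoded as integers (for example by fixed-point scaling), and a single seeded generator stands in for the pairwise shared secrets that real participants would agree on.

```python
import random

# Hypothetical modulus for this sketch; all arithmetic is done mod MODULUS.
MODULUS = 2**61 - 1


def pairwise_masks(n, dim, seed=0):
    """Build one mask vector per participant such that all masks sum to zero.

    For every pair (i, j) with i < j, a shared random vector is added to
    participant i's mask and subtracted from participant j's mask.
    """
    rng = random.Random(seed)  # stand-in for pairwise shared secrets
    masks = [[0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def blind(gradient, mask):
    """Hide an integer-encoded gradient by adding the participant's mask."""
    return [(g + m) % MODULUS for g, m in zip(gradient, mask)]


def aggregate(blinded):
    """Sum the blinded vectors; the pairwise masks cancel in the total."""
    dim = len(blinded[0])
    return [sum(vec[k] for vec in blinded) % MODULUS for k in range(dim)]


# Three participants with small integer-encoded gradients.
gradients = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
masks = pairwise_masks(n=len(gradients), dim=3)
blinded = [blind(g, m) for g, m in zip(gradients, masks)]
total = aggregate(blinded)
print(total)
```

In this sketch the aggregator only ever sees the blinded vectors, yet the total equals the element-wise sum of the original gradients because every shared vector is added once and subtracted once.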


Differential Privacy-based Permissioned Blockchain for Private Data Sharing in Industrial IoT

Muhammad Islam, Swinburne University of Technology, 2021

Permissioned blockchains such as Hyperledger Fabric enable a secure supply chain model in the Industrial Internet of Things (IIoT) through multichannel and private data collection mechanisms. Sharing industrial data, including private data exchanged at every stage between supply chain partners, helps to improve product quality, enable forecasting, and enhance management activities. However, the existing data sharing and querying mechanism in Hyperledger Fabric is not suitable for supply chain environments in the IIoT, because queries are evaluated on the actual data stored on the ledger, which contains sensitive information such as business secrets and special discounts offered to retailers and individuals. To solve this problem, we propose a differential privacy-based permissioned blockchain using Hyperledger Fabric to enable private data sharing in IIoT supply chains (DH-IIoT). We integrate differential privacy into the chaincode (smart contract) of Hyperledger Fabric to achieve privacy preservation. As a result, the query response consists of perturbed data, which protects the sensitive information in the ledger. The proposed work (DH-IIoT) is evaluated by simulating a permissioned blockchain using Hyperledger Fabric. We compare our differential privacy-integrated chaincode with the default chaincode setting of Hyperledger Fabric in a supply chain scenario. The results confirm that the proposed work maintains 96.15% accuracy in the shared data while guaranteeing the protection of sensitive ledger data.

Cryptography and Security
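The idea of answering a ledger query with perturbed data can be sketched with the standard Laplace mechanism. The Python snippet below is only an assumption-laden illustration: the abstract does not say which differential privacy mechanism DH-IIoT uses, real Hyperledger Fabric chaincode is not written this way, and the names `laplace_noise`, `private_count`, and the sample `orders` records are hypothetical. It answers a counting query, whose sensitivity is 1, with noise of scale 1/epsilon.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer a counting query with epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical ledger records: discount (in percent) granted per order.
orders = [
    {"retailer": "A", "discount": 12},
    {"retailer": "B", "discount": 0},
    {"retailer": "C", "discount": 25},
    {"retailer": "D", "discount": 5},
    {"retailer": "E", "discount": 18},
]

rng = random.Random(1)
noisy = private_count(orders, lambda r: r["discount"] >= 10, epsilon=1.0, rng=rng)
print(noisy)  # a perturbed value near the true count of 3
```

A smaller `epsilon` gives stronger privacy at the cost of noisier answers, which is the accuracy trade-off the abstract reports on.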

POSEIDON: Privacy-Preserving Federated Neural Network Learning

Sinem Sav, Apostolos Pyrgelis, Juan R. Troncoso-Pastoriza, 2020

In this paper, we address the problem of privacy-preserving training and evaluation of neural networks in an $N$-party, federated learning setting. We propose a novel system, POSEIDON, the first of its kind in the regime of privacy-preserving neural network training. It employs multiparty lattice-based cryptography to preserve the confidentiality of the training data, the model, and the evaluation data, under a passive-adversary model and collusions between up to $N-1$ parties. To efficiently execute the secure backpropagation algorithm for training neural networks, we provide a generic packing approach that enables Single Instruction, Multiple Data (SIMD) operations on encrypted data. We also introduce arbitrary linear transformations within the cryptographic bootstrapping operation, optimizing the costly cryptographic computations over the parties, and we define a constrained optimization problem for choosing the cryptographic parameters. Our experimental results show that POSEIDON achieves accuracy similar to centralized or decentralized non-private approaches and that its computation and communication overhead scales linearly with the number of parties. POSEIDON trains a 3-layer neural network on the MNIST dataset with 784 features and 60K samples distributed among 10 parties in less than 2 hours.

Cryptography and Security Machine Learning

Robust Aggregation for Adaptive Privacy Preserving Federated Learning in Healthcare

Matei Grama, Maria Musat, Luis Muñoz-Gonzalez, 2020

Federated learning (FL) enables multiple data-owning parties to train models collaboratively without sharing their data. Given the privacy regulations covering patients' healthcare data, learning-based systems in healthcare can benefit greatly from privacy-preserving FL approaches. However, typical model aggregation methods in FL are sensitive to local model updates, which may lead to failure in learning a robust and accurate global model. In this work, we implement and evaluate different robust aggregation methods in FL applied to healthcare data. Furthermore, we show that such methods can detect and discard faulty or malicious local clients during training. We run two sets of experiments using two real-world healthcare datasets for training medical diagnosis classification tasks. Each dataset is used to simulate the performance of three different robust FL aggregation strategies when facing different poisoning attacks. The results show that privacy-preserving methods can be successfully applied alongside Byzantine-robust aggregation techniques. In particular, we observed that using differential privacy (DP) did not significantly impact the final learning convergence of the different aggregation strategies.

Cryptography and Security
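Why plain averaging is sensitive to a single bad update, and how a robust aggregator limits the damage, can be seen in a small Python sketch. The coordinate-wise median used here is one well-known Byzantine-robust aggregator chosen for illustration; the abstract does not name the three strategies the authors evaluate, so this is not a reproduction of their method. The function names and the sample updates are hypothetical.

```python
from statistics import median


def mean_aggregate(updates):
    """Plain averaging: every client update influences the result equally."""
    return [sum(coord) / len(coord) for coord in zip(*updates)]


def coordinate_median(updates):
    """Coordinate-wise median: a minority of outlying updates is ignored."""
    return [median(coord) for coord in zip(*updates)]


# Four honest client updates plus one poisoned update (the last row).
updates = [
    [0.10, -0.20],
    [0.12, -0.18],
    [0.09, -0.21],
    [0.11, -0.19],
    [50.0, 50.0],
]

naive = mean_aggregate(updates)
robust = coordinate_median(updates)
print(naive)   # dragged far away from the honest updates
print(robust)  # stays within the range of the honest updates
```

The median stays inside the range of honest values as long as poisoned updates are a minority in each coordinate, whereas the mean can be moved arbitrarily far by one client.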

HybridAlpha: An Efficient Approach for Privacy-Preserving Federated Learning

Runhua Xu, Nathalie Baracaldo, Yi Zhou, 2019

Federated learning has emerged as a promising approach for collaborative and privacy-preserving learning. Participants in a federated learning process cooperatively train a model by exchanging model parameters instead of the actual training data, which they might want to keep private. However, parameter interaction and the resulting model still might disclose information about the training data used. To address these privacy concerns, several approaches have been proposed based on differential privacy and secure multiparty computation (SMC), among others. They often result in large communication overhead and slow training time. In this paper, we propose HybridAlpha, an approach for privacy-preserving federated learning employing an SMC protocol based on functional encryption. This protocol is simple, efficient and resilient to participants dropping out. We evaluate our approach regarding the training time and data volume exchanged using a federated learning process to train a CNN on the MNIST data set. Evaluation against existing crypto-based SMC solutions shows that HybridAlpha can reduce the training time by 68% and data transfer volume by 92% on average while providing the same model performance and privacy guarantees as the existing solutions.

Cryptography and Security Machine Learning

Privacy Preserving Vertical Federated Learning for Tree-based Models

Yuncheng Wu, Shaofeng Cai, Xiaokui Xiao, 2020

Federated learning (FL) is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data to each other. This paper studies vertical federated learning, which tackles scenarios where (i) collaborating organizations own data of the same set of users but with disjoint features, and (ii) only one organization holds the labels. We propose Pivot, a novel solution for privacy-preserving vertical decision tree training and prediction, ensuring that no intermediate information is disclosed other than what the clients have agreed to release (i.e., the final tree model and the prediction output). Pivot does not rely on any trusted third party and provides protection against a semi-honest adversary that may compromise $m-1$ out of $m$ clients. We further identify two privacy leakages that arise when the trained decision tree model is released in plaintext and propose an enhanced protocol to mitigate them. The proposed solution can also be extended to tree ensemble models, e.g., random forest (RF) and gradient boosting decision tree (GBDT), by treating single decision trees as building blocks. Theoretical and experimental analyses suggest that Pivot is efficient for the privacy achieved.

Cryptography and Security Machine Learning