InstaHide: Instance-hiding Schemes for Private Distributed Learning

Added by Yangsibo Huang

Publication date 2020

Field: Informatics Engineering

Language: English

Authors Yangsibo Huang - Zhao Song - Kai Li

Cryptography and Security Computational Complexity Data Structures and Algorithms

How can multiple distributed entities collaboratively train a shared deep net on their private data while preserving privacy? This paper introduces InstaHide, a simple encryption of training images that can be plugged into existing distributed deep learning pipelines. The encryption is efficient, and applying it during training has only a minor effect on test accuracy. InstaHide encrypts each training image with a one-time secret key, which consists of mixing a number of randomly chosen images and applying a random pixel-wise mask. Other contributions of this paper include: (a) Using a large public dataset (e.g. ImageNet) for mixing during encryption, which improves security. (b) Experimental results showing effectiveness in preserving privacy against known attacks with only minor effects on accuracy. (c) Theoretical analysis showing that successfully attacking privacy requires attackers to solve a difficult computational problem. (d) Demonstrating that use of the pixel-wise mask is important for security, since Mixup alone is shown to be insecure against some efficient attacks. (e) Release of a challenge dataset at https://github.com/Hazelsuko07/InstaHide_Challenge. Our code is available at https://github.com/Hazelsuko07/InstaHide.
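The encryption step described in the abstract (mix several randomly chosen images, then apply a random pixel-wise mask) can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function name `instahide_encode`, the Dirichlet-sampled mixing coefficients, and the choice of a random sign flip as the pixel-wise mask are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def instahide_encode(private_img, public_pool, k=4, rng=None):
    """Hypothetical sketch: mix one private image with k-1 public images,
    then apply a random per-pixel sign mask (an assumed form of the mask)."""
    rng = np.random.default_rng() if rng is None else rng

    # Pick k-1 distinct images from the public pool (e.g. an ImageNet subset).
    idx = rng.choice(len(public_pool), size=k - 1, replace=False)
    images = np.stack([private_img, *public_pool[idx]])  # shape (k, H, W, C)

    # Random non-negative mixing coefficients that sum to 1 (assumption).
    coeffs = rng.dirichlet(np.ones(k))
    mixed = np.tensordot(coeffs, images, axes=1)  # shape (H, W, C)

    # One-time random mask: flip the sign of each pixel independently.
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)
    return mask * mixed
```

Because fresh coefficients and a fresh mask are drawn on every call, each encoded image uses its own one-time key, which matches the abstract's description at a high level.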

Is Private Learning Possible with Instance Encoding?

Nicholas Carlini, Samuel Deng, Sanjam Garg, 2020

A private machine learning algorithm hides as much as possible about its training data while still preserving accuracy. In this work, we study whether a non-private learning algorithm can be made private by relying on an instance-encoding mechanism that modifies the training inputs before feeding them to a normal learner. We formalize both the notion of instance encoding and its privacy by providing two attack models. We first prove impossibility results for achieving a (stronger) model. Next, we demonstrate practical attacks in the second (weaker) attack model on InstaHide, a recent proposal by Huang, Song, Li and Arora [ICML20] that aims to use instance encoding for privacy.

Cryptography and Security Computer Vision and Pattern Recognition Machine Learning

NeuraCrypt: Hiding Private Health Data via Random Neural Networks for Public Training

Adam Yala, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, 2021

Balancing the needs of data privacy and predictive utility is a central challenge for machine learning in healthcare. In particular, privacy concerns have led to a dearth of public datasets, complicated the construction of multi-hospital cohorts and limited the utilization of external machine learning resources. To remedy this, new methods are required to enable data owners, such as hospitals, to share their datasets publicly, while preserving both patient privacy and modeling utility. We propose NeuraCrypt, a private encoding scheme based on random deep neural networks. NeuraCrypt encodes raw patient data using a randomly constructed neural network known only to the data-owner, and publishes both the encoded data and associated labels publicly. From a theoretical perspective, we demonstrate that sampling from a sufficiently rich family of encoding functions offers a well-defined and meaningful notion of privacy against a computationally unbounded adversary with full knowledge of the underlying data-distribution. We propose to approximate this family of encoding functions through random deep neural networks. Empirically, we demonstrate the robustness of our encoding to a suite of adversarial attacks and show that NeuraCrypt achieves competitive accuracy to non-private baselines on a variety of x-ray tasks. Moreover, we demonstrate that multiple hospitals, using independent private encoders, can collaborate to train improved x-ray models. Finally, we release a challenge dataset to encourage the development of new attacks on NeuraCrypt.
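The core idea in the abstract above, encoding data with a randomly constructed network known only to the data owner, can be sketched as follows. This is a simplified illustration under stated assumptions, not the NeuraCrypt architecture: the helper `make_random_encoder`, the two-layer fully connected design, and the use of a seed as the private key are hypothetical.

```python
import numpy as np

def make_random_encoder(in_dim, hidden_dim, out_dim, seed):
    """Hypothetical sketch: build a fixed random two-layer network.
    The seed plays the role of the data owner's private key."""
    rng = np.random.default_rng(seed)
    # Random, untrained weights; scaled to keep activations well-behaved.
    w1 = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, hidden_dim))
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden_dim), size=(hidden_dim, out_dim))

    def encode(x):
        # x has shape (batch, in_dim); ReLU hidden layer, linear output.
        return np.maximum(x @ w1, 0.0) @ w2

    return encode
```

In this sketch, a data owner would publish `encode(x)` together with the labels while keeping the seed secret; two owners using different seeds obtain independent encoders.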

Cryptography and Security Artificial Intelligence

Hiding Data Hiding

Hanzhou Wu, Gen Liu, Xinpeng Zhang, 2021

Data hiding is the art of hiding secret data in a digital cover for covert communication. In this letter, we propose a novel method to disguise data hiding tools, including a data embedding tool and a data extraction tool, as a deep neural network (DNN) with an ordinary task. After a DNN is trained for both style transfer and data hiding, it can transfer the style of an image to a target one, and it can also be used to hide secret data in a cover image or extract secret data from a stego image when given the trigger signal. In other words, the tools of data hiding are themselves hidden to avoid arousing suspicion.

Cryptography and Security Multimedia

Differentially Private Model Publishing for Deep Learning

Lei Yu, Ling Liu, Calton Pu, 2019

Deep learning techniques based on neural networks have shown significant success in a wide range of AI tasks. Large-scale training datasets are one of the critical factors for their success. However, when the training datasets are crowdsourced from individuals and contain sensitive information, the model parameters may encode private information and carry a risk of privacy leakage. The growing trend of sharing and publishing pre-trained models further aggravates such privacy risks. To tackle this problem, we propose a differentially private approach for training neural networks. Our approach includes several new techniques for optimizing both privacy loss and model accuracy. We employ a generalization of differential privacy called concentrated differential privacy (CDP), with both a formal and refined privacy loss analysis on two different data batching methods. We implement a dynamic privacy budget allocator over the course of training to improve model accuracy. Extensive experiments demonstrate that our approach effectively improves privacy loss accounting, training efficiency and model quality under a given privacy budget.

Cryptography and Security Machine Learning

Data-Driven 3D Placement of UAV Base Stations for Arbitrarily Distributed Crowds

Chuan-Chi Lai, Li-Chun Wang, Zhu Han, 2019

In this paper, we consider an Unmanned Aerial Vehicle (UAV)-assisted cellular system which consists of multiple UAV base stations (BSs) cooperating with the terrestrial BSs. In such a heterogeneous network, the problem for cellular operators is how to determine the appropriate number, locations, and altitudes of UAV-BSs to improve the system sum rate while satisfying the data rate demands of arbitrarily distributed flash crowds. We propose a data-driven 3D placement of UAV-BSs that provides an effective placement result at a feasible computational cost. The proposed algorithm searches, in polynomial time, for the appropriate number, location, coverage, and altitude of each UAV-BS in the serving area so as to maximize the system sum rate while guaranteeing the minimum data rate requirement of the user equipment (UE). The simulation results show that the proposed approach can improve the system sum rate in comparison with the case without UAV-BSs.

Networking and Internet Architecture Computational Complexity Data Structures and Algorithms
