Preserving the privacy of continuous and/or high-dimensional data such as images, video, and audio can be challenging with syntactic anonymization methods, which are designed for discrete attributes. Differential privacy, which provides a more formal definition of privacy, has shown more success in sanitizing continuous data. However, both syntactic and differential privacy are susceptible to inference attacks, i.e., an adversary can accurately infer sensitive attributes from sanitized data. The paper proposes a novel filter-based mechanism which preserves the privacy of continuous and high-dimensional attributes against inference attacks. Finding the optimal utility-privacy tradeoff is formulated as a min-diff-max optimization problem. The paper provides an ERM-like analysis of the generalization error as well as a practical algorithm to perform the optimization. In addition, the paper proposes an extension that combines the minimax filter with a differentially private noisy mechanism. Advantages of the method over purely noisy mechanisms are explained and demonstrated with examples. Experiments with several real-world tasks, including facial expression classification, speech emotion classification, and activity classification from motion, show that the minimax filter can simultaneously achieve similar or better target task accuracy and lower inference accuracy, often significantly lower than previous methods.
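As a hedged sketch (assumed notation, not necessarily the paper's exact formulation): with raw data X, target label Y, sensitive attribute S, a filter g, a target classifier w, an adversary classifier h, and a tradeoff weight ρ, a min-diff-max objective of this kind can be written as

\[
\min_{g}\ \Big[\ \rho \,\min_{w}\ \mathbb{E}\,\ell_{\mathrm{util}}\!\big(w(g(X)),\,Y\big)\ -\ \min_{h}\ \mathbb{E}\,\ell_{\mathrm{priv}}\!\big(h(g(X)),\,S\big)\ \Big],
\]

i.e., the filter is chosen so the target task remains learnable (small utility risk) while even the best-responding adversary incurs a large inference loss on the sensitive attribute.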
This paper investigates the capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by recon
A membership inference attack (MIA) against a machine-learning model enables an attacker to determine whether a given data record was part of the model's training data or not. In this paper, we provide an in-depth study of the phenomenon of disparate
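For context, a minimal sketch of a confidence-thresholding membership inference baseline (an illustrative assumption, not the specific attack studied in the paper above): a record is guessed to be a training member when the model's confidence in its true label exceeds a threshold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data: half is used for training (members), half is held out (non-members).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_mem, X_non, y_mem, y_non = train_test_split(X, y, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_mem, y_mem)

def true_label_confidence(model, X, y):
    """Predicted probability the model assigns to each record's true label."""
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

# Confidence-thresholding attack: guess "member" if confidence > tau.
tau = 0.9  # illustrative threshold; in practice it would be tuned, e.g. on shadow models
guess_mem = true_label_confidence(model, X_mem, y_mem) > tau
guess_non = true_label_confidence(model, X_non, y_non) > tau

# Balanced attack accuracy over members and non-members.
attack_acc = 0.5 * (guess_mem.mean() + (1.0 - guess_non.mean()))
print(f"membership inference accuracy: {attack_acc:.3f}")
```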
Powered by machine learning services in the cloud, numerous learning-driven mobile applications are gaining popularity in the market. As deep learning tasks are mostly computation-intensive, it has become a trend to process raw data on devices and se
A membership inference attack aims to identify whether a data sample was used to train a machine learning model or not. It can raise severe privacy risks as the membership can reveal an individual's sensitive information. For example, identifying an ind
The log-loss (also known as cross-entropy loss) metric is ubiquitously used across machine learning applications to assess the performance of classification algorithms. In this paper, we investigate the problem of inferring the labels of a dataset from s
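For reference, the standard multi-class log-loss over n examples, with predicted class probabilities p_{ik} and one-hot true labels y_{ik} over K classes, is

\[
\mathrm{LogLoss} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{K} y_{ik}\,\log p_{ik},
\]

so each reported loss value is an aggregate of the per-example probabilities assigned to the true labels.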