
Utility-Preserving Privacy Mechanisms for Counting Queries

Publication date: 2019
Language: English





Differential privacy (DP) and local differential privacy (LDP) are frameworks for protecting sensitive information in data collections. Both are based on obfuscation. In DP, noise is added to the result of queries on the dataset, whereas in LDP, noise is added directly to the individual records before they are collected. The main advantage of LDP over DP is that it does not require a trusted third party. The main disadvantage is that the trade-off between privacy and utility is usually worse than in DP: to retrieve reasonably good statistics from locally sanitized data, one typically needs a very large collection of records. In this paper, we focus on the problem of estimating counting queries from collections of noisy answers, and we propose a variant of LDP based on the addition of geometric noise. Our main result is that geometric noise has better statistical utility than other LDP mechanisms from the literature.
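To illustrate the idea, here is a minimal sketch (not the paper's exact protocol) of a counting query under LDP with two-sided geometric noise. It assumes each user holds a 0/1 value and reports it with additive noise of mean zero, so the sum of the reports is an unbiased estimate of the true count:

    import numpy as np

    def two_sided_geometric(epsilon, rng):
        # Two-sided geometric noise with parameter alpha = exp(-epsilon):
        # the difference of two i.i.d. geometric variables on {0, 1, 2, ...}
        # has PMF ((1 - alpha) / (1 + alpha)) * alpha**abs(k).
        p = 1.0 - np.exp(-epsilon)
        return (rng.geometric(p) - 1) - (rng.geometric(p) - 1)

    def local_report(value, epsilon, rng):
        # Each user perturbs their own record before it is collected.
        return value + two_sided_geometric(epsilon, rng)

    rng = np.random.default_rng(0)
    epsilon = 1.0
    data = rng.integers(0, 2, size=100_000)   # true 0/1 records
    reports = [local_report(v, epsilon, rng) for v in data]
    estimate = sum(reports)                   # unbiased: noise has mean zero
    print(data.sum(), estimate)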



Related research

Di Zhuang, J. Morris Chang (2020)
In the big data era, more and more cloud-based, data-driven applications are developed that leverage individual data to provide valuable services (the utilities). On the other hand, since the same set of individual data can be used to infer individuals' sensitive information, this creates new channels for snooping on individuals' privacy. Hence it is of great importance to develop techniques that enable data owners to release privatized data that can still be used for certain intended purposes. Existing data-releasing approaches, however, are either privacy-emphasized (with no consideration of utility) or utility-driven (with no guarantees on privacy). In this work, we propose a two-step, perturbation-based, utility-aware privacy-preserving data-releasing framework. First, certain predefined privacy and utility problems are learned from public-domain data (background knowledge). Then, our approach leverages the learned knowledge to precisely perturb the data owner's data into privatized data that can be successfully used for the intended purpose (learning to succeed), without jeopardizing the predefined privacy (training to fail). Extensive experiments have been conducted on the Human Activity Recognition, Census Income, and Bank Marketing datasets to demonstrate the effectiveness and practicality of our framework.
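A minimal sketch of the learning-to-succeed / training-to-fail idea, assuming frozen utility and privacy classifiers learned from public data and a simple lambda-weighted adversarial loss; the shapes, labels, and networks below are hypothetical stand-ins, not the paper's implementation:

    import torch
    import torch.nn as nn

    dim = 16                                         # hypothetical feature size
    perturber = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
    utility_clf = nn.Linear(dim, 2)                  # stands in for the public-domain utility model
    privacy_clf = nn.Linear(dim, 2)                  # stands in for the public-domain privacy model
    for clf in (utility_clf, privacy_clf):
        for p in clf.parameters():
            p.requires_grad_(False)                  # classifiers are frozen; only the perturber learns

    opt = torch.optim.Adam(perturber.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    lam = 1.0                                        # assumed privacy/utility trade-off weight

    x = torch.randn(128, dim)                        # stand-in for the owner's raw data
    y_util = torch.randint(0, 2, (128,))             # intended-purpose labels
    y_priv = torch.randint(0, 2, (128,))             # sensitive-attribute labels

    for step in range(200):
        x_priv = x + perturber(x)                    # perturbed release
        loss = (ce(utility_clf(x_priv), y_util)      # "learning to succeed"
                - lam * ce(privacy_clf(x_priv), y_priv))  # "training to fail"
        opt.zero_grad()
        loss.backward()
        opt.step()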
Location-based queries enable fundamental services for mobile road-network travelers. While the benefits of location-based services (LBS) are numerous, exposing mobile travelers' location information to untrusted LBS providers may lead to privacy breaches. In this paper, we propose StarCloak, a utility-aware and attack-resilient approach to building a privacy-preserving query system for mobile users traveling on road networks. StarCloak has several desirable properties. First, it supports user-defined k-user anonymity and l-segment indistinguishability, along with user-specified spatial and temporal utility constraints, for utility-aware and personalized location privacy. Second, unlike conventional solutions that are indifferent to the underlying road-network structure, StarCloak uses the concept of stars and proposes cloaking graphs for effective location cloaking on road networks. Third, StarCloak achieves strong resilience against replay and query-injection attacks through randomized star selection and pruning. Finally, to enable scalable query processing with high throughput, StarCloak makes cost-aware star-selection decisions by considering query-evaluation and network-communication costs. We evaluate StarCloak on two real-world road-network datasets under various privacy and utility constraints. Results show that StarCloak achieves an improved query success rate and throughput, reduced anonymization time and network usage, and higher attack resilience in comparison to XStar, its most relevant competitor.
Differential Privacy protects individuals' data when statistical queries are published from aggregated databases: applying obfuscating mechanisms to the query results makes the released information less specific but, unavoidably, also decreases its utility. Yet it has been shown that for discrete data (e.g. counting queries), a mandated degree of privacy, and a reasonable interpretation of loss of utility, the Geometric obfuscating mechanism is optimal: it loses as little utility as possible. For continuous query results (e.g. real numbers), however, the optimality result does not hold. Our contribution here is to show that optimality is regained by using the Laplace mechanism for the obfuscation. The technical apparatus involved includes the earlier discrete result by Ghosh et al., recent work on abstract channels and their geometric representation as hyper-distributions, and the dual interpretations of distance between distributions provided by the Kantorovich-Rubinstein Theorem.
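For orientation, a minimal sketch of the standard Laplace mechanism for a real-valued query (the textbook mechanism, not this paper's optimality argument); sensitivity here means the maximum change in the query result when one database row changes:

    import numpy as np

    def laplace_mechanism(query_result, sensitivity, epsilon, rng):
        # Adding Laplace noise with scale sensitivity/epsilon yields epsilon-DP.
        return query_result + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

    rng = np.random.default_rng(0)
    print(laplace_mechanism(42.0, sensitivity=1.0, epsilon=0.5, rng=rng))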
A mechanism for releasing information about a statistical database with sensitive data must resolve a trade-off between utility and privacy. Privacy can be rigorously quantified using the framework of differential privacy, which requires that a mechanism's output distribution is nearly the same whether a given database row is included or excluded. The goal of this paper is strong and general utility guarantees, subject to differential privacy. We pursue mechanisms that guarantee near-optimal utility to every potential user, independent of its side information (modeled as a prior distribution over query results) and preferences (modeled via a loss function). Our main result is: for each fixed count query and differential privacy level, there is a geometric mechanism $M^*$ (a discrete variant of the simple and well-studied Laplace mechanism) that is simultaneously expected-loss-minimizing for every possible user, subject to the differential privacy constraint. This is an extremely strong utility guarantee: every potential user $u$, no matter what its side information and preferences, derives as much utility from $M^*$ as from interacting with a differentially private mechanism $M_u$ that is optimally tailored to $u$.
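Concretely, for a true count $c$, the geometric mechanism is standardly defined by two-sided geometric noise (stated here for orientation, not quoted from the paper):

$\Pr[M^*(c) = k] = \frac{1-\alpha}{1+\alpha}\,\alpha^{|k-c|}, \qquad \alpha = e^{-\varepsilon},\ k \in \mathbb{Z}.$

Since a count query has sensitivity 1, changing one database row shifts $c$ by at most 1, so the ratio of output probabilities on any $k$ is bounded by $\alpha^{-1} = e^{\varepsilon}$, which is exactly $\varepsilon$-differential privacy.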
Sensitive inferences and user re-identification are major threats to privacy when raw sensor data from wearable or portable devices are shared with cloud-assisted applications. To mitigate these threats, we propose mechanisms to transform sensor data before sharing them with applications running on users' devices. These transformations aim at eliminating patterns that can be used for user re-identification or for inferring potentially sensitive activities, while introducing only a minor utility loss for the target application (or task). We show that, on gesture and activity recognition tasks, we can prevent the inference of potentially sensitive activities while keeping the reduction in recognition accuracy of non-sensitive activities below 5 percentage points. We also show that we can reduce the accuracy of user re-identification and of potential gender inference to the level of a random guess, while keeping activity-recognition accuracy comparable to that obtained on the original data.
