Privacy-Protecting Techniques for Behavioral Data: A Survey

Added by Simon Hanisch

Publication date 2021

fields Informatics Engineering

Language English

Authors Simon Hanisch - Patricia Arias-Cabarcos - Javier Parra-Arnau

Cryptography and Security

Our behavior (the way we talk, walk, or think) is unique and can be used as a biometric trait. It also correlates with sensitive attributes like emotions. Hence, techniques to protect individuals' privacy against unwanted inferences are required. To consolidate knowledge in this area, we systematically reviewed applicable anonymization techniques. We taxonomize and compare existing solutions regarding privacy goals, conceptual operation, advantages, and limitations. Our analysis shows that some behavioral traits (e.g., voice) have received much attention, while others (e.g., eye-gaze, brainwaves) are mostly neglected. We also find that the evaluation methodology of behavioral anonymization techniques can be further improved.

Privacy Preserving Techniques Applied to CPNI Data: Analysis and Recommendations

Jeffrey Murray Jr, Afra Mashhadi, Brent Lagesse, 2021

With mobile phone penetration rates reaching 90%, Consumer Proprietary Network Information (CPNI) can offer extremely valuable information to different sectors, including policymakers. Indeed, as part of CPNI, Call Detail Records have been successfully used to provide real-time traffic information, to improve our understanding of the dynamics of people's mobility and thereby support prevention and other measures against infectious diseases, and to offer population statistics. While there is no doubt about the usefulness of CPNI data, privacy concerns regarding the sharing of individuals' data have prevented it from being used to its full potential. Traditional anonymization measures, such as pseudonymization and standard de-identification, have been shown to be insufficient to protect privacy, specifically on mobile phone datasets. As an example, researchers have shown that with only four data points of approximate place and time information of a user, 95% of users could be re-identified in a dataset of 1.5 million mobile phone users. In this landscape paper, we discuss the state-of-the-art anonymization techniques and their shortcomings.
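The re-identification figure quoted above can be made concrete with a small sketch. The Python below estimates, on invented toy data, the fraction of users whose randomly chosen (place, time) points match no other user's trace. The function name `unicity`, the trace format, and the sample data are hypothetical and are not taken from the cited study.

```python
import random


def unicity(traces, num_points, rng=random):
    """Fraction of users singled out by `num_points` points of their own trace.

    `traces` maps a user id to a set of (place, hour) tuples. This data
    layout is an assumption of the sketch, not a standard format.
    """
    unique = 0
    eligible = 0
    for user, points in traces.items():
        if len(points) < num_points:
            continue  # not enough points to sample from this user
        eligible += 1
        # Pick the points an adversary is assumed to know about this user.
        sample = set(rng.sample(sorted(points), num_points))
        # Every user whose trace contains all sampled points is a candidate.
        matches = [u for u, other in traces.items() if sample <= other]
        if matches == [user]:
            unique += 1
    return unique / eligible if eligible else 0.0


# Toy traces (invented): users "a" and "c" share the same two points.
toy_traces = {
    "a": {("cell1", 8), ("cell2", 9)},
    "b": {("cell1", 8), ("cell3", 9)},
    "c": {("cell1", 8), ("cell2", 9)},
}
toy_result = unicity(toy_traces, num_points=2)
```

On this toy data only user "b" is singled out, so `toy_result` is 1/3. Real mobility traces are far sparser, which is why a handful of points can single out most users.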

Cryptography and Security

Classification and Evaluation the Privacy Preserving Data Mining Techniques by using a Data Modification-based Framework

MohammadReza Keyvanpour, 2011

In recent years, data mining techniques have faced a serious challenge due to growing concern about privacy, that is, protecting the privacy of critical and sensitive data. Different techniques and algorithms have already been presented for privacy-preserving data mining, which can be classified into three common approaches: the data modification approach, the data sanitization approach, and the secure multi-party computation approach. This paper presents a data modification-based framework for the classification and evaluation of privacy-preserving data mining techniques. Based on our framework, the techniques are divided into two major groups, namely the perturbation approach and the anonymization approach. In the proposed framework, eight functional criteria are used to analyze and comparatively assess the techniques in these two major groups. The proposed framework provides a good basis for a more accurate comparison of the given techniques for privacy-preserving data mining. In addition, this framework makes it possible to recognize the amount of overlap between different approaches and to identify modern approaches in this field.

Cryptography and Security

Interval Privacy: A Framework for Data Collection

Jie Ding, Bangjun Ding, 2021

The emerging public awareness and government regulations of data privacy motivate new paradigms of collecting and analyzing data that are transparent and acceptable to data owners. We present a new concept of privacy and corresponding data formats, mechanisms, and tradeoffs for privatizing data during data collection. The privacy, named Interval Privacy, requires the conditional distribution of the raw data given the privatized data to be the same as its unconditional distribution over a nontrivial support set. Correspondingly, the proposed privacy mechanism records each data value as a random interval containing it. The proposed interval privacy mechanisms can be easily deployed through most existing survey-based data collection paradigms, e.g., by asking a respondent whether their data value is within a randomly generated range. Another unique feature of interval mechanisms is that they obfuscate the truth but do not distort it. Using a narrowed range to convey information is complementary to the popular paradigm of perturbing data. Also, the interval mechanisms can generate progressively refined information at the discretion of individual respondents. We study different theoretical aspects of the proposed privacy. In the context of supervised learning, we also offer a method such that existing supervised learning algorithms designed for point-valued data can be directly applied to learning from interval-valued data.
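The idea of recording a value as a random interval containing it can be illustrated with a minimal Python sketch. It assumes the simplest possible variant: a single threshold drawn uniformly and independently of the value, with the respondent reporting only which side of it the value falls on. The function name `privatize_interval` and the bounds are hypothetical, and this is not claimed to be the authors' exact mechanism.

```python
import random


def privatize_interval(value, low, high, rng=random):
    """Return a random interval (a, b) with a <= value <= b.

    Assumption of this sketch: the threshold is uniform on [low, high]
    and is drawn without looking at `value`.
    """
    if not low <= value <= high:
        raise ValueError("value must lie within [low, high]")
    threshold = rng.uniform(low, high)
    # The respondent only answers "is your value at most the threshold?".
    if value <= threshold:
        return (low, threshold)
    return (threshold, high)


# Example: report an age of 42 within the assumed range [0, 100].
example_interval = privatize_interval(42.0, 0.0, 100.0, random.Random(0))
```

The reported interval always contains the true value, so the record is coarsened rather than distorted, which matches the abstract's distinction between obfuscating and perturbing data.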

Cryptography and Security

A Comprehensive Survey on Local Differential Privacy Toward Data Statistics and Analysis

Teng Wang, Xuefeng Zhang, Jingyu Feng, 2020

Collecting and analyzing massive data generated by smart devices has become increasingly pervasive in crowdsensing and forms the building block of data-driven decision-making. However, extensive statistics and analysis of such data seriously threaten the privacy of participating users. Local differential privacy (LDP) has been proposed as an excellent and prevalent privacy model with a distributed architecture, which can provide strong privacy guarantees for each user while collecting and analyzing data. LDP ensures that each user's data is first perturbed locally on the client side and then sent to the server side, thereby protecting data from privacy leaks on both the client side and the server side. This survey presents a comprehensive and systematic overview of LDP with respect to privacy models, research tasks, enabling mechanisms, and various applications. Specifically, we first provide a theoretical summarization of LDP, including the LDP model, the variants of LDP, and the basic framework of LDP algorithms. Then, we investigate and compare the diverse LDP mechanisms for various data statistics and analysis tasks from the perspectives of frequency estimation, mean estimation, and machine learning. Moreover, we summarize practical LDP-based application scenarios. Finally, we outline several future research directions under LDP.
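The client-side perturbation followed by server-side estimation described above can be sketched with binary randomized response, a classic mechanism that satisfies local differential privacy. The Python below is a minimal illustration and is not code from the survey. The function names and the choice of a single yes/no attribute are assumptions made for the example.

```python
import math
import random


def randomized_response(bit, epsilon, rng=random):
    """Client side: keep the true bit with probability e^eps / (1 + e^eps)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports, epsilon):
    """Server side: unbiased estimate of the true fraction of 1-bits.

    Uses E[report] = (1 - p) + f * (2p - 1), solved for the frequency f.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Example with invented data: 30% of 20000 users hold the attribute.
example_rng = random.Random(1)
true_bits = [1] * 6000 + [0] * 14000
noisy_reports = [randomized_response(b, 1.0, example_rng) for b in true_bits]
estimated = estimate_frequency(noisy_reports, 1.0)
```

The server never sees a raw bit, yet the aggregate estimate approaches the true frequency as the number of users grows. Smaller values of epsilon give stronger privacy at the cost of a noisier estimate.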

Cryptography and Security

A Survey on Device Behavior Fingerprinting: Data Sources, Techniques, Application Scenarios, and Datasets

Pedro Miguel Sanchez Sanchez, Jose Maria Jorquera Valero, Alberto Huertas Celdran, 2020

In the current network-based computing world, where the number of interconnected devices grows exponentially, their diversity, malfunctions, and cybersecurity threats are increasing at the same rate. To guarantee the correct functioning and performance of novel environments such as Smart Cities, Industry 4.0, or crowdsensing, it is crucial to identify the capabilities of their devices (e.g., sensors, actuators) and detect potential misbehavior that may arise due to cyberattacks, system faults, or misconfigurations. With this goal in mind, a promising research field emerged focusing on creating and managing fingerprints that model the behavior of both the device actions and its components. The article at hand studies the recent growth of the device behavior fingerprinting field in terms of application scenarios, behavioral sources, and processing and evaluation techniques. First, it performs a comprehensive review of the device types, behavioral data, and processing and evaluation techniques used by the most recent and representative research works dealing with two major scenarios: device identification and device misbehavior detection. After that, each work is deeply analyzed and compared, emphasizing its characteristics, advantages, and limitations. This article also provides researchers with a review of the most relevant characteristics of existing datasets as most of the novel processing techniques are based on machine learning and deep learning. Finally, it studies the evolution of these two scenarios in recent years, providing lessons learned, current trends, and future research challenges to guide new solutions in the area.

Cryptography and Security