We introduce a general model for the local obfuscation of probability distributions by probabilistic perturbation, e.g., by adding differentially private noise, and investigate its theoretical properties. Specifically, we relax a notion of distribution privacy (DistP) by generalizing it to divergence, and propose local obfuscation mechanisms that provide divergence distribution privacy. To provide f-divergence distribution privacy, we prove that probabilistic perturbation noise should be added proportionally to the Earth mover's distance between the probability distributions that we want to make indistinguishable. Furthermore, we introduce a local obfuscation mechanism, which we call a coupling mechanism, that provides divergence distribution privacy while optimizing the utility of obfuscated data by using exact/approximate auxiliary information on the input distributions we want to protect.
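As a minimal illustration of noise scaled to the Earth mover's distance, the following Python sketch perturbs a one-dimensional value with Laplace noise whose scale grows with the EMD between the two input distributions to be hidden. The function names, the discrete 1-D setting, and the calibration constant (scale = EMD / eps) are illustrative assumptions, not the paper's construction.

    import numpy as np

    def emd_1d(points, p, q):
        # Earth mover's (1-Wasserstein) distance between discrete distributions
        # p and q supported on the same sorted 1-D points.
        cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))[:-1]
        return float(np.sum(cdf_gap * np.diff(points)))

    def laplace_obfuscate(value, points, p, q, eps, rng=None):
        # Add Laplace noise whose scale grows with the EMD between the two
        # distributions to be hidden (illustrative calibration: scale = EMD / eps).
        rng = np.random.default_rng() if rng is None else rng
        scale = emd_1d(points, p, q) / eps
        return value + rng.laplace(0.0, scale)

    # Example: two input distributions on a small 1-D grid.
    points = np.array([0.0, 1.0, 2.0, 3.0])
    p = np.array([0.7, 0.2, 0.1, 0.0])
    q = np.array([0.1, 0.2, 0.3, 0.4])
    print(laplace_obfuscate(1.0, points, p, q, eps=1.0))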
We introduce a formal model for the information leakage of probability distributions and define a notion called distribution privacy as the local differential privacy for probability distributions. Roughly, the distribution privacy of a local obfuscation mechanism means that the attacker cannot significantly gain any information on the distribution of the mechanism's input by observing its output. Then we show that existing local mechanisms can hide input distributions in terms of distribution privacy, while deteriorating the utility by adding too much noise. For example, we prove that the Laplace mechanism needs to add a large amount of noise proportionally to the infinite Wasserstein distance between the two distributions we want to make indistinguishable. To improve the tradeoff between distribution privacy and utility, we introduce a local obfuscation mechanism, called a tupling mechanism, that adds random dummy data to the output. Then we apply this mechanism to the protection of user attributes in location-based services. Through experiments, we demonstrate that the tupling mechanism outperforms popular local mechanisms in terms of attribute obfuscation and service quality.
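A tupling-style mechanism can be sketched in a few lines of Python: the perturbed true value is reported together with k random dummies in shuffled order, so an observer cannot tell which element of the tuple is real. The parameter names and the uniform dummy distribution are illustrative assumptions; the paper's mechanism and its analysis may differ.

    import random

    def tupling_mechanism(true_value, domain, k, perturb, rng=random):
        # Report the perturbed true value together with k random dummies,
        # in shuffled order, hiding which element of the tuple is real.
        tup = [perturb(true_value)] + [rng.choice(domain) for _ in range(k)]
        rng.shuffle(tup)
        return tuple(tup)

    # Example: obfuscate a discrete location id with 3 dummies and, for
    # illustration only, no extra perturbation of the real value.
    locations = list(range(100))
    print(tupling_mechanism(42, locations, k=3, perturb=lambda v: v))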
This paper presents a high-level circuit obfuscation technique to prevent the theft of intellectual property (IP) of integrated circuits. In particular, our technique protects a class of circuits that relies on constant multiplications, such as filters and neural networks, where the constants themselves are the IP to be protected. By making use of decoy constants and a key-based scheme, a reverse engineer adversary at an untrusted foundry is rendered incapable of discerning true constants from decoy constants. The time-multiplexed constant multiplication (TMCM) block of such circuits, which multiplies an input variable by one constant at a time, is considered as our case study for obfuscation. Furthermore, two TMCM design architectures are taken into account: an implementation using a multiplier and a multiplierless shift-adds implementation. Optimization methods are also applied to reduce the hardware complexity of these architectures. The well-known satisfiability (SAT) and automatic test pattern generation (ATPG) attacks are used to determine the vulnerability of the obfuscated designs. It is observed that the proposed technique incurs small overheads in area, power, and delay that are comparable to the hardware complexity of prominent logic locking methods. Yet, the advantage of our approach is that constants, rather than arbitrary circuit nodes, become key-protected.
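The decoy idea can be made concrete with a toy Python model in which each time-multiplexed step selects one constant from a small group containing the true constant and decoys, and the key decides which element is used. The grouping, key encoding, and function names are illustrative assumptions, not the paper's RTL architecture.

    def tmcm_multiply(x, t, constant_groups, key_bits):
        # At time step t, select one constant from a group holding the true
        # constant and its decoys; the key bit decides which element is used.
        group = constant_groups[t]
        return x * group[key_bits[t] % len(group)]

    # Example: each time step has (true constant, decoy); the correct key picks
    # the true constants, while a wrong key silently selects decoys.
    groups = [(13, 7), (25, 31), (-9, 5)]
    correct_key = [0, 0, 0]   # recovers the protected constants
    wrong_key = [1, 0, 1]     # yields decoy behavior
    print([tmcm_multiply(3, t, groups, correct_key) for t in range(3)])
    print([tmcm_multiply(3, t, groups, wrong_key) for t in range(3)])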
Program obfuscation is an important software protection technique that prevents attackers from revealing the programming logic and design of the software. We introduce translingual obfuscation, a new software obfuscation scheme which makes programs obscure by misusing the unique features of certain programming languages. Translingual obfuscation translates part of a program from its original language to another language which has a different programming paradigm and execution model, thus increasing program complexity and impeding reverse engineering. In this paper, we investigate the feasibility and effectiveness of translingual obfuscation with Prolog, a logic programming language. We implement translingual obfuscation in a tool called BABEL, which can selectively translate C functions into Prolog predicates. By leveraging two important features of the Prolog language, i.e., unification and backtracking, BABEL obfuscates both the data layout and control flow of C programs, making them much more difficult to reverse engineer. Our experiments show that BABEL provides effective and stealthy software obfuscation, while the cost is only modest compared to one of the most popular commercial obfuscators on the market. With BABEL, we verified the feasibility of translingual obfuscation, which we consider to be a promising new direction for software obfuscation.
LDP (Local Differential Privacy) has recently attracted much attention as a metric of data privacy that prevents the inference of personal data from obfuscated data in the local model. However, there are scenarios in which the adversary wants to perform re-identification attacks to link the obfuscated data to users in this model. LDP can cause excessive obfuscation and destroy the utility in these scenarios because it is not designed to directly prevent re-identification. In this paper, we propose a measure of re-identification risks, which we call PIE (Personal Information Entropy). The PIE is designed to directly account for re-identification attacks in the local model: it lower-bounds the lowest possible re-identification error probability (i.e., the Bayes error probability) of the adversary. We analyze the relation between LDP and the PIE, and evaluate the PIE and the utility in distribution estimation for two obfuscation mechanisms providing LDP. Through experiments, we show that when we consider re-identification as a privacy risk, LDP can cause excessive obfuscation and destroy the utility. Then we show that the PIE can be used to guarantee low re-identification risks for the local obfuscation mechanisms while keeping high utility.
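The quantity being bounded can be illustrated with a short Python sketch that computes the lowest achievable re-identification error (the Bayes error of a MAP adversary) for a given user prior and obfuscation channel. This is offered only to illustrate the target quantity; it is not the definition of the PIE itself.

    import numpy as np

    def bayes_reid_error(prior, channel):
        # Lowest achievable re-identification error for an adversary observing
        # one obfuscated output: the MAP adversary picks argmax_u prior[u] *
        # channel[u, y], so the error is 1 - sum_y max_u prior[u] * channel[u, y].
        joint = prior[:, None] * channel
        return 1.0 - float(np.sum(joint.max(axis=0)))

    # Example: 3 users, a 4-symbol obfuscated output alphabet.
    prior = np.array([0.5, 0.3, 0.2])                 # prior over users
    channel = np.array([[0.4, 0.3, 0.2, 0.1],         # P(output | user)
                        [0.1, 0.4, 0.3, 0.2],
                        [0.2, 0.1, 0.3, 0.4]])
    print(bayes_reid_error(prior, channel))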
Local Differential Privacy (LDP) is popularly used in practice for privacy-preserving data collection. Although existing LDP protocols offer high utility for large user populations (100,000 or more users), they perform poorly in scenarios with small user populations (such as those in the cybersecurity domain) and lack perturbation mechanisms that are effective for both ordinal and non-ordinal item sequences while protecting sequence length and content simultaneously. In this paper, we address the small user population problem by introducing the concept of Condensed Local Differential Privacy (CLDP) as a specialization of LDP, and develop a suite of CLDP protocols that offer desirable statistical utility while preserving privacy. Our protocols support different types of client data, ranging from ordinal data types in finite metric spaces (numeric malware infection statistics), to non-ordinal items (O
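One standard way to obtain a distance-scaled indistinguishability guarantee over a finite metric space, of the kind CLDP formalizes for ordinal data, is an exponential-mechanism-style perturbation, sketched below in Python. The mechanism, parameter names, and metric are illustrative assumptions and may differ from the protocols developed in the paper.

    import numpy as np

    def metric_exponential_mechanism(x, domain, dist, alpha, rng=None):
        # Report y with probability proportional to exp(-alpha * d(x, y) / 2);
        # by the triangle inequality, Pr[y|x] / Pr[y|x'] <= exp(alpha * d(x, x')),
        # a guarantee that scales with the distance between inputs.
        rng = np.random.default_rng() if rng is None else rng
        weights = np.exp(-alpha * np.array([dist(x, y) for y in domain]) / 2.0)
        probs = weights / weights.sum()
        return domain[rng.choice(len(domain), p=probs)]

    # Example: ordinal counts 0..10 (e.g., malware infection statistics)
    # with |x - y| as the metric.
    counts = list(range(11))
    print(metric_exponential_mechanism(3, counts, lambda a, b: abs(a - b), alpha=0.8))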