Recent work has demonstrated that by monitoring the Real Time Bidding (RTB) protocol, one can estimate the monetary worth of different users for the programmatic advertising ecosystem, even when the so-called winning bids are encrypted. In this paper we describe how to implement the above techniques in a practical and privacy-preserving manner. Specifically, we study the privacy consequences of reporting back to a centralized server the features that are necessary for estimating the value of encrypted winning bids. We show that by appropriately modulating the granularity of the reported information and by scrambling the communication channel to the server, one can improve the privacy of the system in terms of K-anonymity. We have implemented the above ideas in a browser extension and disseminated it to some 200 users. Analyzing the results from 6 months of deployment, we show that the average value of users for the programmatic advertising ecosystem has grown by more than 75% in the last 3 years.
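As a minimal sketch (not the extension's actual implementation) of how coarsening reported features can raise K-anonymity, the following snippet buckets per-user values before they are reported and measures the size of the smallest group of identical reports; the feature names and bucket sizes are illustrative assumptions.

```python
from collections import Counter

def coarsen(value, bucket_size):
    """Reduce granularity by mapping a numeric feature to its bucket."""
    return int(value // bucket_size) * bucket_size

def k_anonymity(reports):
    """Smallest group size among identical reports (higher is better)."""
    counts = Counter(tuple(r) for r in reports)
    return min(counts.values())

# Hypothetical per-user features: estimated bid value in cents, hour of day.
raw = [(137, 14), (141, 14), (139, 15), (512, 22), (530, 22)]

# Coarsen before reporting: bid values to 50-cent buckets, hours to 4-hour bins.
coarse = [(coarsen(v, 50), coarsen(h, 4)) for v, h in raw]

print(k_anonymity(raw))     # likely 1: every raw report is unique
print(k_anonymity(coarse))  # larger: coarse reports blend into groups
```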
Machine learning models are increasingly made available to the masses through public query interfaces. Recent academic work has demonstrated that malicious users who can query such models are able to infer sensitive information about records within the training data. Differential privacy can thwart such attacks, but not all models can be readily trained to achieve this guarantee or to achieve it with acceptable utility loss. As a result, if a model is trained without a differential privacy guarantee, little is known or can be said about the privacy risk of releasing it. In this work, we investigate and analyze membership attacks to understand why and how they succeed. Based on this understanding, we propose Differential Training Privacy (DTP), an empirical metric to estimate the privacy risk of publishing a classifier when methods such as differential privacy cannot be applied. DTP is a measure of a classifier with respect to its training dataset, and we show that calculating DTP is efficient in many practical cases. We empirically validate DTP using state-of-the-art machine learning models such as neural networks trained on real-world datasets. Our results show that DTP is highly predictive of the success of membership attacks, and therefore reducing DTP also reduces the privacy risk. We advocate for DTP to be used as part of the decision-making process when considering publishing a classifier. To this end, we also suggest adopting the DTP-1 hypothesis: if a classifier has a DTP value above 1, it should not be published.
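The paper gives the formal definition of DTP; purely as an illustrative sketch of the leave-one-out flavor of such a metric, the snippet below measures how much a record's predicted probability shifts when that record is removed from training. This is a stand-in proxy, not the paper's definition; the model, synthetic data, and names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def prediction_drift(X, y, target_idx):
    """Illustrative leave-one-out drift: how much the predicted probability
    for a record changes when that record is dropped from training.
    (A stand-in proxy, not the paper's formal DTP definition.)"""
    full = LogisticRegression(max_iter=1000).fit(X, y)
    mask = np.arange(len(y)) != target_idx
    held_out = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    x = X[target_idx:target_idx + 1]
    return float(np.abs(full.predict_proba(x) - held_out.predict_proba(x)).max())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
print(prediction_drift(X, y, target_idx=0))
```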
Online advertising fuels the (seemingly) free internet. However, although users can access most web services free of charge, they pay a heavy cost on their privacy. They are forced to trust third parties and intermediaries, who not only collect behavioral data but also absorb great amounts of ad revenues. Consequently, more and more users opt out from advertising by resorting to ad blockers, thus costing publishers millions of dollars in lost ad revenues. Although there are various privacy-preserving advertising proposals (e.g., Adnostic, Privad, Brave Ads) from both academia and industry, they all rely on centralized management that users have to blindly trust without being able to audit it, and they also fail to guarantee the integrity of the performance analytics they provide to advertisers. In this paper, we design and deploy THEMIS, a novel, decentralized and privacy-by-design ad platform that requires zero trust from users. THEMIS (i) provides auditability to its participants, (ii) rewards users for viewing ads, and (iii) allows advertisers to verify the performance and billing reports of their ad campaigns. By leveraging smart contracts and zero-knowledge schemes, we implement a prototype of THEMIS; early performance evaluation results show that it can scale linearly on a multi-sidechain setup while supporting more than 51M users on a single sidechain.
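THEMIS's actual construction relies on smart contracts and zero-knowledge schemes; the following toy sketch only illustrates the general idea behind auditable billing reports, namely committing to individual ad-view records so that a reported total can later be checked against the openings. All identifiers are hypothetical, and this is not the THEMIS protocol.

```python
import hashlib

def commit(record: bytes, salt: bytes) -> str:
    """Hash commitment to a single ad-view record (toy construction)."""
    return hashlib.sha256(salt + record).hexdigest()

# Each ad view is logged as a (campaign_id, view_id) record with a secret salt.
views = [(b"campaign-42", bytes([i])) for i in range(5)]
salts = [hashlib.sha256(bytes([i, 99])).digest() for i in range(5)]
commitments = [commit(c + v, s) for (c, v), s in zip(views, salts)]

# The platform publishes the commitments and reports 5 billable views.
# To audit, the advertiser is shown the opened records and re-checks each commitment.
reported_total = 5
opened = list(zip(views, salts))
assert len(opened) == reported_total
assert all(commit(c + v, s) == cm
           for ((c, v), s), cm in zip(opened, commitments))
print("billing report consistent with commitments")
```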
The prevalence of e-commerce has made customers' detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When personalized information is involved, how to protect its privacy becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over $T$ time periods with an \emph{unknown} demand function of the posted price and personalized information. At each time $t$, the retailer observes an arriving customer's personal information and offers a price. The customer then makes the purchase decision, which will be utilized by the retailer to learn the underlying demand function. There is a potentially serious privacy concern during this process: a third-party agent might infer the personalized information and purchase decisions from the price changes of the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer's revenue while avoiding leakage of individual customers' information and purchase decisions. To this end, we first introduce a notion of \emph{anticipating} $(\varepsilon, \delta)$-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and a performance guarantee in terms of regret. Roughly speaking, for $d$-dimensional personalized information, our algorithm achieves an expected regret of order $\tilde{O}(\varepsilon^{-1} \sqrt{d^3 T})$ when the customers' information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to $\tilde{O}(\sqrt{d^2 T} + \varepsilon^{-2} d^2)$.
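The abstract does not spell out the mechanism; as a generic sketch of how differential privacy is commonly injected into such online pricing updates, one can perturb the learned demand parameter with calibrated Laplace noise before posting a price. The linear-demand form, sensitivity value, and parameter names below are assumptions, not the paper's policy.

```python
import numpy as np

rng = np.random.default_rng(1)

def private_price(theta_hat, x, epsilon, sensitivity):
    """Post a price from a noisy copy of the estimated demand parameter.
    Laplace noise with scale sensitivity/epsilon is the standard calibration."""
    noisy_theta = theta_hat + rng.laplace(scale=sensitivity / epsilon,
                                          size=theta_hat.shape)
    # Toy pricing rule under an assumed linear demand model: the price tracks
    # the predicted willingness to pay for the observed covariates x.
    return max(0.0, float(noisy_theta @ x))

theta_hat = np.array([1.2, -0.4, 0.8])   # current demand-parameter estimate
x = np.array([1.0, 0.3, 0.5])            # arriving customer's covariates
print(private_price(theta_hat, x, epsilon=0.5, sensitivity=0.1))
```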
The environmental understanding capabilities of $\textit{augmented reality}$ (AR) and $\textit{mixed reality}$ (MR) devices are continuously improving through advances in sensing, computer vision, and machine learning. Various AR/MR applications demonstrate such capabilities, e.g., scanning a space using a handheld or head-mounted device and capturing a digital representation of the space that is an accurate copy of the real space. However, these capabilities pose privacy risks to users: personally identifiable information can leak from captured 3D maps of sensitive spaces and/or captured sensitive objects within the mapped space. Thus, in this work, we demonstrate how we can leverage 3D object regeneration for preserving the privacy of 3D point clouds. That is, we employ an intermediary layer of protection to transform the 3D point cloud before providing it to third-party applications. Specifically, we use an existing adversarial autoencoder to generate copies of 3D objects where the likeness of the copies to the original can be varied. To test the viability and performance of this method as a privacy-preserving mechanism, we use a 3D classifier to classify and identify these transformed point clouds, i.e., perform $\textit{super}$-class and $\textit{intra}$-class classification. To measure the performance of the proposed privacy framework, we define a privacy metric, $\Pi \in [0,1]$, and a utility metric, $Q \in [0,1]$, both of which are desired to be maximized. Experimental evaluation shows that the privacy framework can indeed variably affect the privacy of a 3D object by varying the privilege level $l \in [0,1]$: if a low $l < 0.17$ is maintained, $\Pi_1, \Pi_2 > 0.4$ is ensured, where $\Pi_1, \Pi_2$ are the super- and intra-class privacy. Lastly, the privacy framework can ensure relatively high intra-class privacy and utility, i.e., $\Pi_2 > 0.63$ and $Q > 0.70$, if the privilege level is kept within the range $0.17 < l < 0.25$.
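The paper defines $\Pi$ and $Q$ precisely; as a simplified proxy for metrics derived from classifier behavior, the sketch below treats intra-class privacy as the rate at which a transformed object's specific identity is no longer recovered, and utility as the rate at which its super-class is still recognized. These proxies and the example labels are assumptions, not the paper's definitions.

```python
def privacy_utility_proxy(true_labels, true_classes, pred_labels, pred_classes):
    """Simplified proxies (not the paper's exact metrics):
    - intra-class privacy: fraction of transformed objects whose specific
      identity is no longer recovered by the classifier,
    - utility: fraction whose super-class is still recognized."""
    n = len(true_labels)
    privacy = sum(t != p for t, p in zip(true_labels, pred_labels)) / n
    utility = sum(t == p for t, p in zip(true_classes, pred_classes)) / n
    return privacy, utility

# Hypothetical classifier outputs on 6 transformed point clouds.
true_labels  = ["chair_03", "chair_07", "table_01", "lamp_02", "lamp_05", "sofa_01"]
true_classes = ["chair",    "chair",    "table",    "lamp",    "lamp",    "sofa"]
pred_labels  = ["chair_11", "chair_07", "table_04", "lamp_09", "lamp_01", "sofa_02"]
pred_classes = ["chair",    "chair",    "table",    "lamp",    "lamp",    "chair"]

print(privacy_utility_proxy(true_labels, true_classes, pred_labels, pred_classes))
```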
We analyze the value to e-commerce website operators of offering privacy options to users, e.g., of allowing users to opt out of ad targeting. In particular, we assume that site operators have some control over the cost that a privacy option imposes on users and ask when it is to their advantage to make such costs low. We consider both the case of a single site and the case of multiple sites that compete both for users who value privacy highly and for users who value it less. One of our main results in the case of a single site is that, under normally distributed utilities, if a privacy-sensitive user is worth at least $\sqrt{2} - 1$ times as much to advertisers as a privacy-insensitive user, the site operator should strive to make the cost of a privacy option as low as possible. In the case of multiple sites, we show how a Prisoner's-Dilemma situation can arise: in the equilibrium in which both sites are obliged to offer a privacy option at minimal cost, both sites obtain lower revenue than they would if they colluded and neither offered a privacy option.
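For intuition on the single-site threshold, note that $\sqrt{2} - 1 \approx 0.414$, so the result applies whenever advertisers value a privacy-sensitive user at roughly 41% or more of the value of a privacy-insensitive user.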