No Arabic abstract
A digital goods auction is a type of auction where potential buyers bid the maximal price that they are willing to pay for a certain item, which a seller can produce at a negligible cost and in unlimited quantity. To maximise her benefits, the aim for the seller is to find the optimal sales price, which every buyer whose bid is not lower will pay. For fairness and privacy purposes, buyers may be concerned about protecting the confidentiality of their bids. Secure Multi-Party Computation is a domain of Cryptography that would allow the seller to compute the optimal sales price while guaranteeing that the bids remain secret. Paradoxically, as a function of the buyers bids, the sales price inevitably reveals some private information. Generic frameworks and entropy-based techniques based on Quantitative Information Flow have been developed in order to quantify and restrict those leakages. Due to their combinatorial nature, these techniques do not scale to large input spaces. In this work, we aim at scaling those privacy analyses to large input spaces in the particular case of digital goods auctions. We derive closed-form formulas for the posterior min-entropy of private inputs in two and three-party auctions, which enables us to effectively quantify the information leaks for arbitrarily large input spaces. We also provide supportive experimental evidence that enables us to formulate a conjecture that would allow us to extend our results to any number of parties.
Elaborate protocols in Secure Multi-party Computation enable several participants to compute a public function of their own private inputs while ensuring that no undesired information leaks about the private inputs, and without resorting to any trusted third party. However, the public output of the computation inevitably leaks some information about the private inputs. Recent works have introduced a framework and proposed some techniques for quantifying such information flow. Yet, owing to their complexity, those methods do not scale to practical situations that may involve large input spaces. The main contribution of the work reported here is to formally investigate the information flow captured by the min-entropy in the particular case of secure three-party computations of affine functions in order to make its quantification scalable to realistic scenarios. To this end, we mathematically derive an explicit formula for this entropy under uniform prior beliefs about the inputs. We show that this closed-form expression can be computed in time constant in the inputs sizes and logarithmic in the coefficients of the affine function. Finally, we formulate some theoretical bounds for this privacy leak in the presence of non-uniform prior beliefs.
In this work, we consider the problem of secure multi-party computation (MPC), consisting of $Gamma$ sources, each has access to a large private matrix, $N$ processing nodes or workers, and one data collector or master. The master is interested in the result of a polynomial function of the input matrices. Each source sends a randomized functions of its matrix, called as its share, to each worker. The workers process their shares in interaction with each other, and send some results to the master such that it can derive the final result. There are several constraints: (1) each worker can store a function of each input matrix, with the size of $frac{1}{m}$ fraction of that input matrix, (2) up to $t$ of the workers, for some integer $t$, are adversary and may collude to gain information about the private inputs or can do malicious actions to make the final result incorrect. The objective is to design an MPC scheme with the minimum number the workers, called the recovery threshold, such that the final result is correct, workers learn no information about the input matrices, and the master learns nothing beyond the final result. In this paper, we propose an MPC scheme that achieves the recovery threshold of $3t+2m-1$ workers, which is order-wise less than the recovery threshold of the conventional methods. The challenge in dealing with this set up is that when nodes interact with each other, the malicious messages that adversarial nodes generate propagate through the system, and can mislead the honest nodes. To deal with this challenge, we design some subroutines that can detect erroneous messages, and correct or drop them.
Contextual bandits are online learners that, given an input, select an arm and receive a reward for that arm. They use the reward as a learning signal and aim to maximize the total reward over the inputs. Contextual bandits are commonly used to solve recommendation or ranking problems. This paper considers a learning setting in which multiple parties aim to train a contextual bandit together in a private way: the parties aim to maximize the total reward but do not want to share any of the relevant information they possess with the other parties. Specifically, multiple parties have access to (different) features that may benefit the learner but that cannot be shared with other parties. One of the parties pulls the arm but other parties may not learn which arm was pulled. One party receives the reward but the other parties may not learn the reward value. This paper develops a privacy-preserving multi-party contextual bandit for this learning setting by combining secure multi-party computation with a differentially private mechanism based on epsilon-greedy exploration.
We investigate the problem of multi-party private set intersection (MP-PSI). In MP-PSI, there are $M$ parties, each storing a data set $mathcal{p}_i$ over $N_i$ replicated and non-colluding databases, and we want to calculate the intersection of the data sets $cap_{i=1}^M mathcal{p}_i$ without leaking any information beyond the set intersection to any of the parties. We consider a specific communication protocol where one of the parties, called the leader party, initiates the MP-PSI protocol by sending queries to the remaining parties which are called client parties. The client parties are not allowed to communicate with each other. We propose an information-theoretic scheme that privately calculates the intersection $cap_{i=1}^M mathcal{p}_i$ with a download cost of $D = min_{t in {1, cdots, M}} sum_{i in {1, cdots M}setminus {t}} leftlceil frac{|mathcal{p}_t|N_i}{N_i-1}rightrceil$. Similar to the 2-party PSI problem, our scheme builds on the connection between the PSI problem and the multi-message symmetric private information retrieval (MM-SPIR) problem. Our scheme is a non-trivial generalization of the 2-party PSI scheme as it needs an intricate design of the shared common randomness. Interestingly, in terms of the download cost, our scheme does not incur any penalty due to the more stringent privacy constraints in the MP-PSI problem compared to the 2-party PSI problem.
Recent success of deep neural networks (DNNs) hinges on the availability of large-scale dataset; however, training on such dataset often poses privacy risks for sensitive training information. In this paper, we aim to explore the power of generative models and gradient sparsity, and propose a scalable privacy-preserving generative model DATALENS. Comparing with the standard PATE privacy-preserving framework which allows teachers to vote on one-dimensional predictions, voting on the high dimensional gradient vectors is challenging in terms of privacy preservation. As dimension reduction techniques are required, we need to navigate a delicate tradeoff space between (1) the improvement of privacy preservation and (2) the slowdown of SGD convergence. To tackle this, we take advantage of communication efficient learning and propose a novel noise compression and aggregation approach TOPAGG by combining top-k compression for dimension reduction with a corresponding noise injection mechanism. We theoretically prove that the DATALENS framework guarantees differential privacy for its generated data, and provide analysis on its convergence. To demonstrate the practical usage of DATALENS, we conduct extensive experiments on diverse datasets including MNIST, Fashion-MNIST, and high dimensional CelebA, and we show that, DATALENS significantly outperforms other baseline DP generative models. In addition, we adapt the proposed TOPAGG approach, which is one of the key building blocks in DATALENS, to DP SGD training, and show that it is able to achieve higher utility than the state-of-the-art DP SGD approach in most cases. Our code is publicly available at https://github.com/AI-secure/DataLens.