No Arabic abstract
In the study of differential privacy, composition theorems (starting with the original paper of Dwork, McSherry, Nissim, and Smith (TCC06)) bound the degradation of privacy when composing several differentially private algorithms. Kairouz, Oh, and Viswanath (ICML15) showed how to compute the optimal bound for composing $k$ arbitrary $(epsilon,delta)$-differentially private algorithms. We characterize the optimal composition for the more general case of $k$ arbitrary $(epsilon_{1},delta_{1}),ldots,(epsilon_{k},delta_{k})$-differentially private algorithms where the privacy parameters may differ for each algorithm in the composition. We show that computing the optimal composition in general is $#$P-complete. Since computing optimal composition exactly is infeasible (unless FP=$#$P), we give an approximation algorithm that computes the composition to arbitrary accuracy in polynomial time. The algorithm is a modification of Dyers dynamic programming approach to approximately counting solutions to knapsack problems (STOC03).
Let $fsubseteq{0,1}^ntimesXi$ be a relation and $g:{0,1}^mto{0,1,*}$ be a promise function. This work investigates the randomised query complexity of the relation $fcirc g^nsubseteq{0,1}^{mcdot n}timesXi$, which can be viewed as one of the most general cases of composition in the query model (letting $g$ be a relation seems to result in a rather unnatural definition of $fcirc g^n$). We show that for every such $f$ and $g$, $$mathcal R(fcirc g^n) in Omega(mathcal R(f)cdotsqrt{mathcal R(g)}),$$ where $mathcal R$ denotes the randomised query complexity. On the other hand, we demonstrate a relation $f_0$ and a promise function $g_0$, such that $mathcal R(f_0)inTheta(sqrt n)$, $mathcal R(g_0)inTheta(n)$ and $mathcal R(f_0circ g_0^n)inTheta(n)$ $-$ that is, our composition statement is tight. To the best of our knowledge, there was no known composition theorem for the randomised query complexity of relations or promise functions (and for the special case of total functions our lower bound gives multiplicative improvement of $sqrt{log n}$).
We initiate a study of the composition properties of interactive differentially private mechanisms. An interactive differentially private mechanism is an algorithm that allows an analyst to adaptively ask queries about a sensitive dataset, with the property that an adversarial analysts view of the interaction is approximately the same regardless of whether or not any individuals data is in the dataset. Previous studies of composition of differential privacy have focused on non-interactive algorithms, but interactive mechanisms are needed to capture many of the intended applications of differential privacy and a number of the important differentially private primitives. We focus on concurrent composition, where an adversary can arbitrarily interleave its queries to several differentially private mechanisms, which may be feasible when differentially private query systems are deployed in practice. We prove that when the interactive mechanisms being composed are pure differentially private, their concurrent composition achieves privacy parameters (with respect to pure or approximate differential privacy) that match the (optimal) composition theorem for noninteractive differential privacy. We also prove a composition theorem for interactive mechanisms that satisfy approximate differential privacy. That bound is weaker than even the basic (suboptimal) composition theorem for noninteractive differential privacy, and we leave closing the gap as a direction for future research, along with understanding concurrent composition for other variants of differential privacy.
We give a fast algorithm to optimally compose privacy guarantees of differentially private (DP) algorithms to arbitrary accuracy. Our method is based on the notion of privacy loss random variables to quantify the privacy loss of DP algorithms. The running time and memory needed for our algorithm to approximate the privacy curve of a DP algorithm composed with itself $k$ times is $tilde{O}(sqrt{k})$. This improves over the best prior method by Koskela et al. (2020) which requires $tilde{Omega}(k^{1.5})$ running time. We demonstrate the utility of our algorithm by accurately computing the privacy loss of DP-SGD algorithm of Abadi et al. (2016) and showing that our algorithm speeds up the privacy computations by a few orders of magnitude compared to prior work, while maintaining similar accuracy.
Given a binary dominance relation on a set of alternatives, a common thread in the social sciences is to identify subsets of alternatives that satisfy certain notions of stability. Examples can be found in areas as diverse as voting theory, game theory, and argumentation theory. Brandt and Fischer [BF08] proved that it is NP-hard to decide whether an alternative is contained in some inclusion-minimal upward or downward covering set. For both problems, we raise this lower bound to the Theta_{2}^{p} level of the polynomial hierarchy and provide a Sigma_{2}^{p} upper bound. Relatedly, we show that a variety of other natural problems regarding minimal or minimum-size covering sets are hard or complete for either of NP, coNP, and Theta_{2}^{p}. An important consequence of our results is that neither minimal upward nor minimal downward covering sets (even when guaranteed to exist) can be computed in polynomial time unless P=NP. This sharply contrasts with Brandt and Fischers result that minimal bidirectional covering sets (i.e., sets that are both minimal upward and minimal downward covering sets) are polynomial-time computable.
We prove two new results about the randomized query complexity of composed functions. First, we show that the randomized composition conjecture is false: there are families of partial Boolean functions $f$ and $g$ such that $R(fcirc g)ll R(f) R(g)$. In fact, we show that the left hand side can be polynomially smaller than the right hand side (though in our construction, both sides are polylogarithmic in the input size of $f$). Second, we show that for all $f$ and $g$, $R(fcirc g)=Omega(mathop{noisyR}(f)cdot R(g))$, where $mathop{noisyR}(f)$ is a measure describing the cost of computing $f$ on noisy oracle inputs. We show that this composition theorem is the strongest possible of its type: for any measure $M(cdot)$ satisfying $R(fcirc g)=Omega(M(f)R(g))$ for all $f$ and $g$, it must hold that $mathop{noisyR}(f)=Omega(M(f))$ for all $f$. We also give a clean characterization of the measure $mathop{noisyR}(f)$: it satisfies $mathop{noisyR}(f)=Theta(R(fcirc gapmaj_n)/R(gapmaj_n))$, where $n$ is the input size of $f$ and $gapmaj_n$ is the $sqrt{n}$-gap majority function on $n$ bits.