ترغب بنشر مسار تعليمي؟ اضغط هنا

Finite Sample Differentially Private Confidence Intervals

75   0   0.0 ( 0 )
 نشر من قبل Vishesh Karwa
 تاريخ النشر 2017
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We study the problem of estimating finite sample confidence intervals of the mean of a normal population under the constraint of differential privacy. We consider both the known and unknown variance cases and construct differentially private algorithms to estimate confidence intervals. Crucially, our algorithms guarantee a finite sample coverage, as opposed to an asymptotic coverage. Unlike most previous differentially private algorithms, we do not require the domain of the samples to be bounded. We also prove lower bounds on the expected size of any differentially private confidence set showing that our the parameters are optimal up to polylogarithmic factors.



قيم البحث

اقرأ أيضاً

One of the most common statistical goals is to estimate a population parameter and quantify uncertainty by constructing a confidence interval. However, the field of differential privacy lacks easy-to-use and general methods for doing so. We partially fill this gap by developing two broadly applicable methods for private confidence-interval construction. The first is based on asymptotics: for two widely used model classes, exponential families and linear regression, a simple private estimator has the same asymptotic normal distribution as the corresponding non-private estimator, so confidence intervals can be constructed using quantiles of the normal distribution. These are computationally cheap and accurate for large data sets, but do not have good coverage for small data sets. The second approach is based on the parametric bootstrap. It applies out of the box to a wide class of private estimators and has good coverage at small sample sizes, but with increased computational cost. Both methods are based on post-processing the private estimator and do not consume additional privacy budget.
163 - Jay Bartroff , Gary Lorden , 2021
We present an efficient method of calculating exact confidence intervals for the hypergeometric parameter. The method inverts minimum-width acceptance intervals after shifting them to make their endpoints nondecreasing while preserving their level. T he resulting set of confidence intervals achieves minimum possible average width, and even in comparison with confidence sets not required to be intervals it attains the minimum possible cardinality most of the time, and always within 1. The method compares favorably with existing methods not only in the size of the intervals but also in the time required to compute them. The available R package hyperMCI implements the proposed method.
Consider a two-treatment, two-period crossover trial, with responses that are continuous random variables. We find a large-sample frequentist 1-alpha confidence interval for the treatment difference that utilizes the uncertain prior information that there is no differential carryover effect.
$f$-DP has recently been proposed as a generalization of classical definitions of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of $f$-DP, we propose the c oncept canonical noise distribution (CND) which captures whether an additive privacy mechanism is appropriately tailored for a given $f$, and give a construction that produces a CND given an arbitrary tradeoff function $f$. We show that private hypothesis tests are intimately related to CNDs, allowing for the release of private $p$-values at no additional privacy cost as well as the construction of uniformly most powerful (UMP) tests for binary data. We apply our techniques to the problem of difference of proportions testing, and construct a UMP unbiased semi-private test which upper bounds the performance of any DP test. Using this as a benchmark we propose a private test, based on the inversion of characteristic functions, which allows for optimal inference for the two population parameters and is nearly as powerful as the semi-private UMPU. When specialized to the case of $(epsilon,0)$-DP, we show empirically that our proposed test is more powerful than any $(epsilon/sqrt 2)$-DP test and has more accurate type I errors than the classic normal approximation test.
We study the basic operation of set union in the global model of differential privacy. In this problem, we are given a universe $U$ of items, possibly of infinite size, and a database $D$ of users. Each user $i$ contributes a subset $W_i subseteq U$ of items. We want an ($epsilon$,$delta$)-differentially private algorithm which outputs a subset $S subset cup_i W_i$ such that the size of $S$ is as large as possible. The problem arises in countless real world applications; it is particularly ubiquitous in natural language processing (NLP) applications as vocabulary extraction. For example, discovering words, sentences, $n$-grams etc., from private text data belonging to users is an instance of the set union problem. Known algorithms for this problem proceed by collecting a subset of items from each user, taking the union of such subsets, and disclosing the items whose noisy counts fall above a certain threshold. Crucially, in the above process, the contribution of each individual user is always independent of the items held by other users, resulting in a wasteful aggregation process, where some item counts happen to be way above the threshold. We deviate from the above paradigm by allowing users to contribute their items in a $textit{dependent fashion}$, guided by a $textit{policy}$. In this new setting ensuring privacy is significantly delicate. We prove that any policy which has certain $textit{contractive}$ properties would result in a differentially private algorithm. We design two new algorithms, one using Laplace noise and other Gaussian noise, as specific instances of policies satisfying the contractive properties. Our experiments show that the new algorithms significantly outperform previously known mechanisms for the problem.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا