Minimax Rates of Estimating Approximate Differential Privacy

Added by Xiyang Liu

Publication date 2019

Field: Informatics Engineering

Language: English

Authors: Xiyang Liu, Sewoong Oh

Subject: Information Theory

Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing them. Both types of errors would be greatly reduced if we had a data-driven approach to verifying privacy guarantees from black-box access to a mechanism. We pose this as a property estimation problem and study the fundamental trade-off between the accuracy of the estimated privacy guarantee and the number of samples required. We introduce a novel estimator that uses polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With $n$ samples, we show that this estimator achieves the performance of a straightforward plug-in estimator with $n \ln n$ samples, a phenomenon referred to as effective sample size amplification. The minimax optimality of the proposed estimator is proved by comparing it to a matching fundamental lower bound.
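To make the baseline in this abstract concrete, here is a minimal sketch of a plug-in estimator. It assumes the quantity being estimated is the standard divergence $\sum_x \max(0, P(x) - e^{\epsilon} Q(x))$, whose value is the smallest $\delta$ for which the output distributions $P$ and $Q$ on two neighboring inputs satisfy the $(\epsilon,\delta)$ condition in one direction. The function names and the randomized-response test mechanism are illustrative assumptions, not taken from the paper, and this is not the polynomial-approximation estimator the abstract proposes.

```python
import math
import random
from collections import Counter

def plugin_dp_divergence(samples_p, samples_q, epsilon):
    """Plug-in estimate of sum_x max(0, P(x) - e^eps * Q(x)).

    samples_p / samples_q are outputs of the black-box mechanism on two
    neighboring inputs (hypothetical setup for illustration).
    """
    n_p, n_q = len(samples_p), len(samples_q)
    count_p, count_q = Counter(samples_p), Counter(samples_q)
    support = set(count_p) | set(count_q)
    scale = math.exp(epsilon)
    # Replace the unknown probabilities by empirical frequencies.
    return sum(
        max(0.0, count_p[x] / n_p - scale * count_q[x] / n_q)
        for x in support
    )

def randomized_response(bit, epsilon, rng):
    """Binary randomized response: keep the bit w.p. e^eps / (1 + e^eps)."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit

rng = random.Random(0)
true_eps = 1.0
out0 = [randomized_response(0, true_eps, rng) for _ in range(20000)]
out1 = [randomized_response(1, true_eps, rng) for _ in range(20000)]

# Close to 0 when tested at the mechanism's true epsilon ...
delta_at_true_eps = plugin_dp_divergence(out0, out1, true_eps)
# ... and clearly positive when tested at a smaller epsilon.
delta_at_small_eps = plugin_dp_divergence(out0, out1, 0.1)
```

The plug-in estimate is biased upward near the kink of $\max(0,\cdot)$, which is the regime where, according to the abstract, a polynomial approximation gains the $\ln n$ factor in effective sample size.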

Minimax rates of entropy estimation on large alphabets via best polynomial approximation

Yihong Wu, Pengkun Yang (2014)

Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $$\Big(\frac{k}{n \log k}\Big)^2 + \frac{\log^2 k}{n}$$ if $n$ exceeds a constant factor of $\frac{k}{\log k}$; otherwise there exists no consistent estimator. This refines the recent result of Valiant-Valiant [VV11] that the minimal sample size for consistent entropy estimation scales according to $\Theta(\frac{k}{\log k})$. The apparatus of best polynomial approximation plays a key role in both the construction of optimal estimators and, via a duality argument, the minimax lower bound.
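As a rough illustration of the quantities in this abstract, the sketch below computes the naive plug-in (empirical) entropy and evaluates the stated rate expression with all constant factors dropped. Both function names are assumptions made for illustration; the plug-in estimator is only the simple baseline, not the polynomial-approximation estimator the abstract refers to.

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Empirical Shannon entropy in nats: -sum_x p_hat(x) * log p_hat(x)."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def minimax_mse_rate(k, n):
    """Rate expression (k / (n log k))^2 + log^2(k) / n, constants omitted.

    Meaningful only for k >= 2 and, per the abstract, when n exceeds a
    constant multiple of k / log k.
    """
    return (k / (n * math.log(k))) ** 2 + math.log(k) ** 2 / n

# Example: alphabet of size 1000, increasing sample sizes.
for n in (1_000, 10_000, 100_000):
    print(n, minimax_mse_rate(1000, n))
```

The first term is the squared bias contribution and dominates when $n$ is small relative to $k$; the second is the variance term and dominates once $n$ is large.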

Information Theory, Statistics Theory

Differential Privacy for Binary Functions via Randomized Graph Colorings

Rafael G. L. D'Oliveira, Muriel Medard, Parastoo Sadeghi (2021)

We present a framework for designing differentially private (DP) mechanisms for binary functions via a graph representation of datasets. Datasets are nodes in the graph and any two neighboring datasets are connected by an edge. The true binary function we want to approximate assigns a value (or true color) to a dataset. Randomized DP mechanisms are then equivalent to randomized colorings of the graph. A key notion we use is that of the boundary of the graph. Any two neighboring datasets assigned a different true color belong to the boundary. Under this framework, we show that fixing the mechanism behavior at the boundary induces a unique optimal mechanism. Moreover, if the mechanism is to have a homogeneous behavior at the boundary, we present a closed expression for the optimal mechanism, which is obtained by means of a pullback operation on the optimal mechanism of a line graph. For balanced mechanisms, not favoring one binary value over another, the optimal $(\epsilon,\delta)$-DP mechanism takes a particularly simple form, depending only on the minimum distance to the boundary, on $\epsilon$, and on $\delta$.
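The graph view described here can be sketched as a simple check: a randomized coloring is given by the probability of outputting color 1 at each node, and it is $(\epsilon,\delta)$-DP when the standard inequality holds across every edge in both directions. The function name, the four-node line graph, and the probability values below are hypothetical choices for illustration; they are not the optimal mechanism derived in the paper.

```python
import math

def is_dp_coloring(edges, prob_one, epsilon, delta, tol=1e-12):
    """Check (epsilon, delta)-DP for a binary randomized coloring.

    edges:    pairs of neighboring datasets (graph edges).
    prob_one: prob_one[d] = P(mechanism outputs 1 on dataset d).
    """
    scale = math.exp(epsilon)
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            p1, q1 = prob_one[a], prob_one[b]
            # With a binary output, the only nontrivial events are {1} and {0}.
            if p1 > scale * q1 + delta + tol:
                return False
            if (1 - p1) > scale * (1 - q1) + delta + tol:
                return False
    return True

# Line graph 0 - 1 - 2 - 3; true color 0 on {0, 1} and 1 on {2, 3},
# so nodes 1 and 2 form the boundary (illustrative values only).
edges = [(0, 1), (1, 2), (2, 3)]
smooth = [1 / 6, 1 / 3, 2 / 3, 5 / 6]
exact = [0.0, 0.0, 1.0, 1.0]

print(is_dp_coloring(edges, smooth, math.log(2), 0.0))
print(is_dp_coloring(edges, exact, math.log(2), 0.0))
```

The `smooth` coloring moves away from probability 1/2 as the distance to the boundary grows, while `exact` always reports the true color and therefore fails the check for any finite $\epsilon$ with $\delta < 1$.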

Information Theory

Estimating the Lengths of Memory Words

Gusztav Morvai, Benjamin Weiss (2008)

For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1}=w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words, but not necessarily minimal ones. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{\xi_1,\xi_2,\ldots,\xi_n\}$. We give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.

Information Theory

Minimax Lower Bounds on Dictionary Learning for Tensor Data

Zahra Shakeri, Waheed U. Bajwa, Anand D. Sarwate (2016)

This paper provides fundamental limits on the sample complexity of estimating dictionaries for tensor data. The specific focus of this work is on $K$th-order tensor data and the case where the underlying dictionary can be expressed in terms of $K$ smaller dictionaries. It is assumed the data are generated by linear combinations of these structured dictionary atoms and observed through white Gaussian noise. This work first provides a general lower bound on the minimax risk of dictionary learning for such tensor data and then adapts the proof techniques for specialized results in the case of sparse and sparse-Gaussian linear combinations. The results suggest the sample complexity of dictionary learning for tensor data can be significantly lower than that for unstructured data: for unstructured data it scales linearly with the product of the dictionary dimensions, whereas for tensor-structured data the bound scales linearly with the sum of the product of the dimensions of the (smaller) component dictionaries. A partial converse is provided for the case of 2nd-order tensor data to show that the bounds in this paper can be tight. This involves developing an algorithm for learning highly-structured dictionaries from noisy tensor data. Finally, numerical experiments highlight the advantages associated with explicitly accounting for tensor data structure during dictionary learning.
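The scaling contrast stated in this abstract can be illustrated with a small arithmetic sketch. It assumes each of the $K$ component dictionaries has size $m_k \times p_k$, so the full dictionary has $\prod_k m_k$ rows and $\prod_k p_k$ columns; the function names and the example dimensions are hypothetical, and constants and other factors in the actual bounds are ignored.

```python
from math import prod

def unstructured_scale(component_dims):
    """Product of the full dictionary's dimensions: (prod m_k) * (prod p_k)."""
    m = prod(m_k for m_k, _ in component_dims)
    p = prod(p_k for _, p_k in component_dims)
    return m * p

def structured_scale(component_dims):
    """Sum over components of the product of each component's dimensions."""
    return sum(m_k * p_k for m_k, p_k in component_dims)

# Two component dictionaries of size 8 x 16 (2nd-order tensor data).
dims = [(8, 16), (8, 16)]
print(unstructured_scale(dims))  # (8*8) * (16*16) = 16384
print(structured_scale(dims))    # 8*16 + 8*16 = 256
```

Even for this small example the structured count is 64 times smaller, which is the kind of gap the abstract's comparison points to.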

Information Theory

Estimating Sparse Discrete Distributions Under Local Privacy and Communication Constraints

Jayadev Acharya, Peter Kairouz, Yuhan Liu (2020)

We consider the problem of estimating sparse discrete distributions under local differential privacy (LDP) and communication constraints. We characterize the sample complexity for sparse estimation under LDP constraints up to a constant factor and the sample complexity under communication constraints up to a logarithmic factor. Our upper bounds under LDP are based on the Hadamard Response, a private-coin scheme that requires only one bit of communication per user. Under communication constraints, we propose public-coin schemes based on random hashing functions. Our tight lower bounds are based on the recently proposed method of chi-squared contractions.

Information Theory, Cryptography and Security, Data Structures and Algorithms
