Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Improved Storage for Efficient Private Information Retrieval

78 0 0.0 ( 0 )

Download Cite

Added by Karim Banawan

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Karim Banawan - Batuhan Arasli - Sennur Ulukus

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We consider the problem of private information retrieval from $N$ emph{storage-constrained} databases. In this problem, a user wishes to retrieve a single message out of $M$ messages (of size $L$) without revealing any information about the identity of the message to individual databases. Each database stores $mu ML$ symbols, i.e., a $mu$ fraction of the entire library, where $frac{1}{N} leq mu leq 1$. Our goal is to characterize the optimal tradeoff curve for the storage cost (captured by $mu$) and the normalized download cost ($D/L$). We show that the download cost can be reduced by employing a hybrid storage scheme that combines emph{MDS coding} ideas with emph{uncoded partial replication} ideas. When there is no coding, our scheme reduces to Attia-Kumar-Tandon storage scheme, which was initially introduced by Maddah-Ali-Niesen in the context of the caching problem, and when there is no uncoded partial replication, our scheme reduces to Banawan-Ulukus storage scheme; in general, our scheme outperforms both.

rate research

Private Set Intersection: A Multi-Message Symmetric Private Information Retrieval Perspective

91 - Zhusheng Wang , Karim Banawan , Sennur Ulukus 2019

We study the problem of private set intersection (PSI). In this problem, there are two entities $E_i$, for $i=1, 2$, each storing a set $mathcal{P}_i$, whose elements are picked from a finite field $mathbb{F}_K$, on $N_i$ replicated and non-colluding databases. It is required to determine the set intersection $mathcal{P}_1 cap mathcal{P}_2$ without leaking any information about the remaining elements to the other entity with the least amount of downloaded bits. We first show that the PSI problem can be recast as a multi-message symmetric private information retrieval (MM-SPIR) problem. Next, as a stand-alone result, we derive the information-theoretic sum capacity of MM-SPIR, $C_{MM-SPIR}$. We show that with $K$ messages, $N$ databases, and the size of the desired message set $P$, the exact capacity of MM-SPIR is $C_{MM-SPIR} = 1 - frac{1}{N}$ when $P leq K-1$, provided that the entropy of the common randomness $S$ satisfies $H(S) geq frac{P}{N-1}$ per desired symbol. This result implies that there is no gain for MM-SPIR over successive single-message SPIR (SM-SPIR). For the MM-SPIR problem, we present a novel capacity-achieving scheme that builds on the near-optimal scheme of Banawan-Ulukus originally proposed for the multi-message PIR (MM-PIR) problem without database privacy constraints. Surprisingly, our scheme here is exactly optimal for the MM-SPIR problem for any $P$, in contrast to the scheme for the MM-PIR problem, which was proved only to be near-optimal. Our scheme is an alternative to the SM-SPIR scheme of Sun-Jafar. Based on this capacity result for MM-SPIR, and after addressing the added requirements in its conversion to the PSI problem, we show that the optimal download cost for the PSI problem is $minleft{leftlceilfrac{P_1 N_2}{N_2-1}rightrceil, leftlceilfrac{P_2 N_1}{N_1-1}rightrceilright}$, where $P_i$ is the cardinality of set $mathcal{P}_i$

Information Theory Cryptography and Security Databases

Semantic Private Information Retrieval

141 - Sajani Vithana , Karim Banawan , Sennur Ulukus 2020

We investigate the problem of semantic private information retrieval (semantic PIR). In semantic PIR, a user retrieves a message out of $K$ independent messages stored in $N$ replicated and non-colluding databases without revealing the identity of the desired message to any individual database. The messages come with emph{different semantics}, i.e., the messages are allowed to have emph{non-uniform a priori probabilities} denoted by $(p_i>0,: i in [K])$, which are a proxy for their respective popularity of retrieval, and emph{arbitrary message sizes} $(L_i,: i in [K])$. This is a generalization of the classical private information retrieval (PIR) problem, where messages are assumed to have equal a priori probabilities and equal message sizes. We derive the semantic PIR capacity for general $K$, $N$. The results show that the semantic PIR capacity depends on the number of databases $N$, the number of messages $K$, the a priori probability distribution of messages $p_i$, and the message sizes $L_i$. We present two achievable semantic PIR schemes: The first one is a deterministic scheme which is based on message asymmetry. This scheme employs non-uniform subpacketization. The second scheme is probabilistic and is based on choosing one query set out of multiple options at random to retrieve the required message without the need for exponential subpacketization. We derive necessary and sufficient conditions for the semantic PIR capacity to exceed the classical PIR capacity with equal priors and sizes. Our results show that the semantic PIR capacity can be larger than the classical PIR capacity when longer messages have higher popularities. However, when messages are equal-length, the non-uniform priors cannot be exploited to improve the retrieval rate over the classical PIR capacity.

Information Theory Cryptography and Security Signal Processing

Timely Private Information Retrieval

156 - Karim Banawan , Ahmed Arafa , Sennur Ulukus 2021

We introduce the problem of emph{timely} private information retrieval (PIR) from $N$ non-colluding and replicated servers. In this problem, a user desires to retrieve a message out of $M$ messages from the servers, whose contents are continuously updating. The retrieval process should be executed in a timely manner such that no information is leaked about the identity of the message. To assess the timeliness, we use the emph{age of information} (AoI) metric. Interestingly, the timely PIR problem reduces to an AoI minimization subject to PIR constraints under emph{asymmetric traffic}. We explicitly characterize the optimal tradeoff between the PIR rate and the AoI metric (peak AoI or average AoI) for the case of $N=2$, $M=3$. Further, we provide some structural insights on the general problem with arbitrary $N$, $M$.

Information Theory Cryptography and Security Networking and Internet Architecture

On the Storage Cost of Private Information Retrieval

102 - Chao Tian 2019

We consider the fundamental tradeoff between the storage cost and the download cost in private information retrieval systems, without any explicit structural restrictions on the storage codes, such as maximum distance separable codes or uncoded storage. Two novel outer bounds are provided, which have the following implications. When the messages are stored without any redundancy across the databases, the optimal PIR strategy is to download all the messages; on the other hand, for PIR capacity-achieving codes, each database can reduce the storage cost, from storing all the messages, by no more than one message on average. We then focus on the two-message two-database case, and show that a stronger outer bound can be derived through a novel pseudo-message technique. This stronger outer bound suggests that a precise characterization of the storage-download tradeoff may require non-Shannon type inequalities, or at least more sophisticated bounding techniques.

Information Theory Information Theory

Multi-Party Private Set Intersection: An Information-Theoretic Approach

71 - Zhusheng Wang , Karim Banawan , Sennur Ulukus 2020

We investigate the problem of multi-party private set intersection (MP-PSI). In MP-PSI, there are $M$ parties, each storing a data set $mathcal{p}_i$ over $N_i$ replicated and non-colluding databases, and we want to calculate the intersection of the data sets $cap_{i=1}^M mathcal{p}_i$ without leaking any information beyond the set intersection to any of the parties. We consider a specific communication protocol where one of the parties, called the leader party, initiates the MP-PSI protocol by sending queries to the remaining parties which are called client parties. The client parties are not allowed to communicate with each other. We propose an information-theoretic scheme that privately calculates the intersection $cap_{i=1}^M mathcal{p}_i$ with a download cost of $D = min_{t in {1, cdots, M}} sum_{i in {1, cdots M}setminus {t}} leftlceil frac{|mathcal{p}_t|N_i}{N_i-1}rightrceil$. Similar to the 2-party PSI problem, our scheme builds on the connection between the PSI problem and the multi-message symmetric private information retrieval (MM-SPIR) problem. Our scheme is a non-trivial generalization of the 2-party PSI scheme as it needs an intricate design of the shared common randomness. Interestingly, in terms of the download cost, our scheme does not incur any penalty due to the more stringent privacy constraints in the MP-PSI problem compared to the 2-party PSI problem.

Information Theory Cryptography and Security Databases

comments

Fetching comments

Alshahba Private University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Improved Storage for Efficient Private Information Retrieval

Ask ChatGPT about the research

No Arabic abstract

Read More