No Arabic abstract
We consider constructing capacity-achieving linear codes with minimum message size for private information retrieval (PIR) from $N$ non-colluding databases, where each message is coded using maximum distance separable (MDS) codes, such that it can be recovered from accessing the contents of any $T$ databases. It is shown that the minimum message size (sometimes also referred to as the sub-packetization factor) is significantly, in fact exponentially, lower than previously believed. More precisely, when $K>T/textbf{gcd}(N,T)$ where $K$ is the total number of messages in the system and $textbf{gcd}(cdot,cdot)$ means the greatest common divisor, we establish, by providing both novel code constructions and a matching converse, the minimum message size as $textbf{lcm}(N-T,T)$, where $textbf{lcm}(cdot,cdot)$ means the least common multiple. On the other hand, when $K$ is small, we show that it is in fact possible to design codes with a message size even smaller than $textbf{lcm}(N-T,T)$.
In a distributed storage system, private information retrieval (PIR) guarantees that a user retrieves one file from the system without revealing any information about the identity of its interested file to any individual server. In this paper, we investigate $(N,K,M)$ coded sever model of PIR, where each of $M$ files is distributed to the $N$ servers in the form of $(N,K)$ maximum distance separable (MDS) code for some $N>K$ and $M>1$. As a result, we propose a new capacity-achieving $(N,K,M)$ coded linear PIR scheme such that it can be implemented with file length $frac{K(N-K)}{gcd(N,K)}$, which is much smaller than the previous best result $Kbig(frac{N}{gcd(N,K)}big)^{M-1}$. Notably, among all the capacity-achieving coded linear PIR schemes, we show that the file length is optimal if $M>biglfloor frac{K}{gcd(N,K)}-frac{K}{N-K}bigrfloor+1$, and within a multiplicative gap $frac{K}{gcd(N,K)}$ of a lower bound on the minimum file length otherwise.
In quantum private information retrieval (QPIR), a user retrieves a classical file from multiple servers by downloading quantum systems without revealing the identity of the file. The QPIR capacity is the maximal achievable ratio of the retrieved file size to the total download size. In this paper, the capacity of QPIR from MDS-coded and colluding servers is studied. Two classes of QPIR, called stabilizer QPIR and dimension squared QPIR induced from classical strongly linear PIR are defined, and the related QPIR capacities are derived. For the non-colluding case, the general QPIR capacity is derived when the number of files goes to infinity. The capacities of symmetric and non-symmetric QPIR with coded and colluding servers are proved to coincide, being double to their classical counterparts. A general statement on the converse bound for QPIR with coded and colluding servers is derived showing that the capacities of stabilizer QPIR and dimension squared QPIR induced from any class of PIR are upper bounded by twice the classical capacity of the respective PIR class. The proposed capacity-achieving scheme combines the star-product scheme by Freij-Hollanti et al. and the stabilizer QPIR scheme by Song et al. by employing (weakly) self-dual Reed--Solomon codes.
This paper investigates reducing sub-packetization of capacity-achieving schemes for uncoded Storage Constrained Private Information Retrieval (SC-PIR) systems. In the SC-PIR system, a user aims to retrieve one out of $K$ files from $N$ servers while revealing nothing about its identity to any individual server, in which the $K$ files are stored at the $N$ servers in an uncoded form and each server can store up to $mu K$ equivalent files, where $mu$ is the normalized storage capacity of each server. We first prove that there exists a capacity-achieving SC-PIR scheme for a given storage design if and only if all the packets are stored exactly at $Mtriangleq mu N$ servers for $mu$ such that $M=mu Nin{2,3,ldots,N}$. Then, the optimal sub-packetization for capacity-achieving linear SC-PIR schemes is characterized as the solution to an optimization problem, which is typically hard to solve because of involving indicator functions. Moreover, a new notion of array called Storage Design Array (SDA) is introduced for the SC-PIR system. With any given SDA, an associated capacity-achieving SC-PIR scheme is constructed. Next, the SC-PIR schemes that have equal-size packets are investigated. Furthermore, the optimal equal-size sub-packetization among all capacity-achieving linear SC-PIR schemes characterized by Woolsey et al. is proved to be $frac{N(M-1)}{gcd(N,M)}$. Finally, by allowing unequal size of packets, a greedy SDA construction is proposed, where the sub-packetization of the associated SC-PIR scheme is upper bounded by $frac{N(M-1)}{gcd(N,M)}$. Among all capacity-achieving linear SC-PIR schemes, the sub-packetization is optimal when $min{M,N-M}|N$ or $M=N$, and within a multiplicative gap $frac{min{M,N-M}}{gcd(N,M)}$ of the optimal one otherwise. In particular, for the case $N=dcdot Mpm1$ where $dgeq 2$, another SDA is constructed to obtain lower sub-packetization.
The notion of a Private Information Retrieval (PIR) code was recently introduced by Fazeli, Vardy and Yaakobi who showed that this class of codes permit PIR at reduced levels of storage overhead in comparison with replicated-server PIR. In the present paper, the construction of an $(n,k)$ $tau$-server binary, linear PIR code having parameters $n = sumlimits_{i = 0}^{ell} {m choose i}$, $k = {m choose ell}$ and $tau = 2^{ell}$ is presented. These codes are obtained through homogeneous-polynomial evaluation and correspond to the binary, Projective Reed Muller (PRM) code. The construction can be extended to yield PIR codes for any $tau$ of the form $2^{ell}$, $2^{ell}-1$ and any value of $k$, through a combination of single-symbol puncturing and shortening of the PRM code. Each of these code constructions above, have smaller storage overhead in comparison with other PIR codes appearing in the literature. For the particular case of $tau=3,4$, we show that the codes constructed here are optimal, systematic PIR codes by providing an improved lower bound on the block length $n(k, tau)$ of a systematic PIR code. It follows from a result by Vardy and Yaakobi, that these codes also yield optimal, systematic primitive multi-set $(n, k, tau)_B$ batch codes for $tau=3,4$. The PIR code constructions presented here also yield upper bounds on the generalized Hamming weights of binary PRM codes.
We consider the problem of Private Information Retrieval with Private Side Information (PIR-PSI), wherein a user wants to retrieve a file from replication based non-colluding databases by using the prior knowledge of a subset of the files stored on the databases. The PIR-PSI framework ensures that the privacy of the demand and the side information are jointly preserved, thereby finding potential applications when multiple files have to be downloaded spread across different time-instants. Although the capacity of the PIR-PSI setting is known, we observe that the underlying capacity achieving code construction uses Maximum Distance Separable (MDS) codes thereby contributing to high computational complexity when retrieving the demand. Pointing at this drawback of MDS-based PIR-PSI codes, we propose XOR-based PIR-PSI codes for a simple yet non-trivial setting of two non-colluding databases and two side information files at the user. While our codes offer substantial reduction in complexity when compared to MDS based codes, the code-rate marginally falls short of the capacity of the PIR-PSI setting. Nevertheless, we show that our code-rate is strictly higher than that of XOR-based codes for PIR with no side information, thereby implying that our codes can be useful when downloading multiple files in a sequential manner, instead of applying XOR-based PIR codes on each file.