A source sequence is to be guessed with some fidelity based on a rate-limited description of an observed sequence with which it is correlated. The trade-off between the description rate and the exponential growth rate of the least power mean of the number of guesses is characterized.
We consider the problem of Private Information Retrieval with Private Side Information (PIR-PSI), wherein a user wants to retrieve a file from replication-based, non-colluding databases by using prior knowledge of a subset of the files stored on the databases. The PIR-PSI framework ensures that the privacy of the demand and of the side information are jointly preserved, and thereby finds potential applications when multiple files have to be downloaded at different time instants. Although the capacity of the PIR-PSI setting is known, we observe that the underlying capacity-achieving code construction uses Maximum Distance Separable (MDS) codes, which leads to high computational complexity when retrieving the demand. To address this drawback of MDS-based PIR-PSI codes, we propose XOR-based PIR-PSI codes for a simple yet non-trivial setting of two non-colluding databases and two side information files at the user. While our codes offer a substantial reduction in complexity compared to MDS-based codes, the code rate falls marginally short of the capacity of the PIR-PSI setting. Nevertheless, we show that our code rate is strictly higher than that of XOR-based codes for PIR with no side information, which implies that our codes can be useful when downloading multiple files sequentially, instead of applying XOR-based PIR codes to each file.
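As a point of reference for why XOR-based retrieval is computationally cheap, the following is a minimal sketch of the textbook two-database XOR scheme for PIR without side information. It is not the PIR-PSI construction proposed in the abstract above and it does not achieve capacity; the function names `make_queries`, `answer`, and `decode` are hypothetical, and all files are assumed to have equal length.

```python
import secrets

def make_queries(num_files, wanted):
    """User side: one index-subset query per database, encoded as a bitmask."""
    q1 = secrets.randbits(num_files)  # uniformly random subset of file indices
    q2 = q1 ^ (1 << wanted)           # same subset with the wanted index toggled
    return q1, q2

def answer(files, query):
    """Database side: XOR of all files whose index is in the queried subset."""
    acc = bytes(len(files[0]))        # all-zero block (assumes equal-length files)
    for idx, f in enumerate(files):
        if (query >> idx) & 1:
            acc = bytes(a ^ b for a, b in zip(acc, f))
    return acc

def decode(a1, a2):
    """User side: the two answers differ by exactly the wanted file."""
    return bytes(x ^ y for x, y in zip(a1, a2))
```

Each query on its own is a uniformly random subset, so neither database alone learns the demand, and decoding is a single XOR of the two answers.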
Stationary memoryless sources produce two correlated random sequences $X^n$ and $Y^n$. A guesser seeks to recover $X^n$ in two stages, by first guessing $Y^n$ and then $X^n$. The contributions of this work are twofold: (1) We characterize the least achievable exponential growth rate (in $n$) of any positive $\rho$-th moment of the total number of guesses when $Y^n$ is obtained by applying a deterministic function $f$ component-wise to $X^n$. We prove that, depending on $f$, the least exponential growth rate in the two-stage setup is lower than when guessing $X^n$ directly. We further propose a simple Huffman code-based construction of a function $f$ that is a viable candidate for the minimization of the least exponential growth rate in the two-stage guessing setup. (2) We characterize the least achievable exponential growth rate of the $\rho$-th moment of the total number of guesses required to recover $X^n$ when Stage 1 need not end with a correct guess of $Y^n$ and without assumptions on the stationary memoryless sources producing $X^n$ and $Y^n$.
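For the baseline of guessing $X^n$ directly, Arikan's classical result gives the exponential growth rate of the $\rho$-th guessing moment of a memoryless source as $\rho H_{1/(1+\rho)}(X)$, where $H_\alpha$ is the Rényi entropy of order $\alpha$. The sketch below computes that single-stage exponent and, for small $n$, the exact moment under the optimal guessing order; it does not implement the two-stage results of the abstract above, and the function names are hypothetical.

```python
import math
from itertools import product

def renyi_entropy(pmf, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in nats."""
    return math.log(sum(p ** alpha for p in pmf if p > 0)) / (1.0 - alpha)

def guessing_exponent(pmf, rho):
    """Arikan's exponent rho * H_{1/(1+rho)}(X) for direct (single-stage) guessing."""
    return rho * renyi_entropy(pmf, 1.0 / (1.0 + rho))

def guessing_moment(pmf, n, rho):
    """Exact rho-th moment of the number of guesses for an i.i.d. X^n,
    guessing sequences in order of decreasing probability (brute force)."""
    probs = sorted((math.prod(t) for t in product(pmf, repeat=n)), reverse=True)
    return sum(p * (k ** rho) for k, p in enumerate(probs, start=1))
```

Comparing `math.log(guessing_moment(pmf, n, rho)) / n` with `guessing_exponent(pmf, rho)` for growing `n` illustrates the convergence, although the brute-force enumeration is only practical for small alphabets and short sequences.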
The secrecy of a distributed-storage system for passwords is studied. The encoder, Alice, observes a length-n password and describes it using two hints, which she stores in different locations. The legitimate receiver, Bob, observes both hints. In one scenario the requirement is that the expected number of guesses it takes Bob to guess the password approach one as n tends to infinity, and in the other that the expected size of the shortest list that Bob must form to guarantee that it contain the password approach one. The eavesdropper, Eve, sees only one of the hints. Assuming that Alice cannot control which hint Eve observes, the largest normalized (by n) exponent that can be guaranteed for the expected number of guesses it takes Eve to guess the password is characterized for each scenario. Key to the proof are new results on Arikan's guessing and Bunte and Lapidoth's task-encoding problem; in particular, the paper establishes a close relation between the two problems. A rate-distortion version of the model is also discussed, as is a generalization that allows for Alice to produce $\delta$ (not necessarily two) hints, for Bob to observe $\nu$ (not necessarily two) of the hints, and for Eve to observe $\eta$ (not necessarily one) of the hints. The generalized model is robust against $\delta - \nu$ disk failures.
This letter investigates a new class of index coding problems. One sender broadcasts packets to multiple users, each desiring a subset, by exploiting prior knowledge of linear combinations of packets. We refer to this class of problems as index coding with coded side-information. Our aim is to characterize the minimum index code length that the sender needs to transmit to simultaneously satisfy all user requests. We show that the optimal binary vector index code length is equal to the minimum rank (minrank) of a matrix whose elements consist of the sets of desired packet indices and side-information encoding matrices. This is the natural extension of matrix minrank in the presence of coded side information. Using the derived expression, we propose a greedy randomized algorithm to minimize the rank of the derived matrix.
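To make the minrank notion concrete, here is a minimal sketch of the classical special case with uncoded side information, where user $i$ wants packet $i$ and already knows a set of other packets. It is a brute-force search over all fitting binary matrices, not the letter's coded-side-information formulation and not its greedy randomized algorithm; the names `gf2_rank` and `minrank_uncoded` are hypothetical.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for r in rows:
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                break
            r ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)

def minrank_uncoded(n, side_info):
    """Brute-force minrank for uncoded side information.
    side_info[i] is the set of packet indices user i already knows;
    user i is assumed to want packet i."""
    free = [(i, j) for i in range(n) for j in sorted(side_info[i]) if j != i]
    best = n
    for bits in product((0, 1), repeat=len(free)):
        rows = [1 << i for i in range(n)]  # diagonal entries are fixed to 1
        for (i, j), b in zip(free, bits):
            if b:
                rows[i] |= 1 << j          # free entry: user i knows packet j
        best = min(best, gf2_rank(rows))
    return best
```

The search is exponential in the number of free entries, so it is only suitable for toy instances; it serves to show what quantity a rank-minimization heuristic is trying to approximate.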
The capacity of the semideterministic discrete memoryless broadcast channel (SD-BC) with partial message side-information (P-MSI) at the receivers is established. In the setting without a common message, it is shown that P-MSI to the stochastic receiver alone can increase capacity, whereas P-MSI to the deterministic receiver can only increase capacity if also the stochastic receiver has P-MSI. The latter holds only for the setting without a common message: if the encoder also conveys a common message, then P-MSI to the deterministic receiver alone can increase capacity. These capacity results are used to show that feedback from the stochastic receiver can increase the capacity of the SD-BC without P-MSI and the sum-rate capacity of the SD-BC with P-MSI at the deterministic receiver. The link between P-MSI and feedback is a feedback code, which---roughly speaking---turns feedback into P-MSI at the stochastic receiver and hence helps the stochastic receiver mitigate experienced interference. For the case where the stochastic receiver has full MSI (F-MSI) and can thus fully mitigate experienced interference also in the absence of feedback, it is shown that feedback cannot increase capacity.