Research papers, master's and doctoral theses published by Qifa Yan

Improved Constructions for Secure Multi-Party Batch Matrix Multiplication

Jinbao Zhu, Qifa Yan, 2021

This paper investigates the problem of Secure Multi-party Batch Matrix Multiplication (SMBMM), where a user aims to compute the pairwise products $\mathbf{A}\divideontimes\mathbf{B}\triangleq(\mathbf{A}^{(1)}\mathbf{B}^{(1)},\ldots,\mathbf{A}^{(M)}\mathbf{B}^{(M)})$ of two batches of massive matrices $\mathbf{A}$ and $\mathbf{B}$ that are generated from two sources, through $N$ honest-but-curious servers that share some common randomness. The matrices $\mathbf{A}$ (resp. $\mathbf{B}$) must be kept secure from any subset of up to $X_{\mathbf{A}}$ (resp. $X_{\mathbf{B}}$) servers even if they collude, and the user must not obtain any information about $(\mathbf{A},\mathbf{B})$ beyond the products $\mathbf{A}\divideontimes\mathbf{B}$. A novel computation strategy for the single secure matrix multiplication problem (i.e., the case $M=1$) is first proposed and then generalized to a strategy for SMBMM by means of cross subspace alignment. The SMBMM strategy focuses on the tradeoff between recovery threshold (the number of successful computing servers that the user needs to wait for), system cost (upload cost, the amount of common randomness, and download cost), and system complexity (encoding, computing, and decoding complexities). Notably, compared with the known result by Chen et al., the strategy for the degraded case $X=X_{\mathbf{A}}=X_{\mathbf{B}}$ achieves a better recovery threshold, amount of common randomness, download cost, and decoding complexity when $X$ is less than some parameter threshold, while the performance with respect to the other measures remains identical.

Information Theory

Capacity-Achieving Private Information Retrieval Schemes from Uncoded Storage Constrained Servers with Low Sub-packetization

Jinbao Zhu, Qifa Yan, Xiaohu Tang, 2021

This paper investigates reducing the sub-packetization of capacity-achieving schemes for uncoded Storage Constrained Private Information Retrieval (SC-PIR) systems. In the SC-PIR system, a user aims to retrieve one out of $K$ files from $N$ servers while revealing nothing about its identity to any individual server, where the $K$ files are stored at the $N$ servers in uncoded form and each server can store up to $\mu K$ equivalent files, with $\mu$ the normalized storage capacity of each server. We first prove that there exists a capacity-achieving SC-PIR scheme for a given storage design if and only if all the packets are stored at exactly $M\triangleq \mu N$ servers, for $\mu$ such that $M=\mu N\in\{2,3,\ldots,N\}$. Then, the optimal sub-packetization for capacity-achieving linear SC-PIR schemes is characterized as the solution to an optimization problem, which is typically hard to solve because it involves indicator functions. Moreover, a new notion of array called the Storage Design Array (SDA) is introduced for the SC-PIR system. With any given SDA, an associated capacity-achieving SC-PIR scheme is constructed. Next, the SC-PIR schemes that have equal-size packets are investigated. Furthermore, the optimal equal-size sub-packetization among all capacity-achieving linear SC-PIR schemes characterized by Woolsey et al. is proved to be $\frac{N(M-1)}{\gcd(N,M)}$. Finally, by allowing packets of unequal size, a greedy SDA construction is proposed, where the sub-packetization of the associated SC-PIR scheme is upper bounded by $\frac{N(M-1)}{\gcd(N,M)}$. Among all capacity-achieving linear SC-PIR schemes, the sub-packetization is optimal when $\min\{M,N-M\}\mid N$ or $M=N$, and within a multiplicative gap $\frac{\min\{M,N-M\}}{\gcd(N,M)}$ of the optimal one otherwise. In particular, for the case $N=d\cdot M\pm1$ where $d\geq 2$, another SDA is constructed to obtain lower sub-packetization.
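
The closed-form quantities quoted in this abstract are easy to evaluate numerically. The following is a minimal sketch, not the authors' code: it only evaluates the stated bound N(M-1)/gcd(N,M), the stated optimality condition, and the stated multiplicative gap. The function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from math import gcd


def _check(N: int, M: int) -> None:
    # The abstract assumes M = mu*N lies in {2, 3, ..., N}.
    if not 2 <= M <= N:
        raise ValueError("expected 2 <= M <= N")


def subpacketization_bound(N: int, M: int) -> int:
    """Evaluate N(M-1)/gcd(N,M), the value quoted in the abstract."""
    _check(N, M)
    # gcd(N, M) divides N, so the integer division is exact.
    return N * (M - 1) // gcd(N, M)


def greedy_is_optimal(N: int, M: int) -> bool:
    """Stated optimality condition: min{M, N-M} divides N, or M = N."""
    _check(N, M)
    if M == N:
        return True
    return N % min(M, N - M) == 0


def multiplicative_gap(N: int, M: int) -> Fraction:
    """Stated gap min{M, N-M}/gcd(N,M); reported as 1 when the condition above holds."""
    if greedy_is_optimal(N, M):
        return Fraction(1)
    return Fraction(min(M, N - M), gcd(N, M))


if __name__ == "__main__":
    for N, M in [(6, 4), (5, 2), (8, 8)]:
        print(N, M, subpacketization_bound(N, M),
              greedy_is_optimal(N, M), multiplicative_gap(N, M))
```

For instance, with N = 6 and M = 4 the bound evaluates to 9 and the optimality condition holds, whereas for N = 5 and M = 2 the condition fails and the stated gap is 2.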

Information Theory

Robust and Secure Cache-aided Private Linear Function Retrieval from Coded Servers

Qifa Yan, Daniela Tuninetti, 2021

This work investigates a system where each user aims to retrieve a scalar linear function of the files of a library, which are Maximum Distance Separable coded and stored at multiple distributed servers. The system needs to guarantee robust decoding in the sense that each user must decode its demanded function with signals received from any subset of servers whose cardinality exceeds a threshold. In addition, (a) the content of the library must be kept secure from a wiretapper who obtains all the signals from the servers; (b) any subset of users together cannot obtain any information about the demands of the remaining users; and (c) the users' demands must be kept private against all the servers even if they collude. Achievable schemes are derived by modifying existing Placement Delivery Array (PDA) constructions, originally proposed for single-server single-file retrieval coded caching systems without any privacy, security, or robustness constraints. It is shown that the PDAs describing the original Maddah-Ali and Niesen's coded caching scheme result in a load-memory tradeoff that is optimal to within a constant multiplicative gap, except for the small memory regime when the number of files is smaller than the number of users. As by-products, improved order optimality results are derived for three less restrictive systems in all parameter regimes.

Information Theory

Symmetric Private Polynomial Computation From Lagrange Encoding

Jinbao Zhu, Qifa Yan, Xiaohu Tang, 2020

The problem of $X$-secure $T$-colluding symmetric Private Polynomial Computation (PPC) from a coded storage system with $B$ Byzantine and $U$ unresponsive servers is studied in this paper. Specifically, a dataset consisting of $M$ files is stored across $N$ distributed servers according to $(N,K+X)$ Maximum Distance Separable (MDS) codes such that any group of up to $X$ colluding servers cannot learn anything about the data files. A user wishes to privately evaluate one out of a set of candidate polynomial functions over the $M$ files from the system, while guaranteeing that any $T$ colluding servers cannot learn anything about the identity of the desired function and that the user cannot learn anything about the $M$ data files beyond the desired polynomial function, in the presence of $B$ Byzantine servers that can maliciously send arbitrary responses to confuse the user and $U$ unresponsive servers that do not respond at all. Two novel symmetric PPC schemes using Lagrange encoding are proposed. Both schemes achieve the same PPC rate $1-\frac{G(K+X-1)+T+2B}{N-U}$, secrecy rate $\frac{G(K+X-1)+T}{N-(G(K+X-1)+T+2B+U)}$, finite field size, and decoding complexity, where $G$ is the maximum degree over all the candidate polynomial functions. In particular, the first scheme addresses the general case in which the candidate functions consist of arbitrary polynomials, while the second scheme restricts the candidate functions to a finite-dimensional vector space (or subspace) of polynomials over $\mathbb{F}_p$ but requires lower upload cost, query complexity, and server computation complexity. Remarkably, the PPC setup studied in this paper generalizes all previous MDS coded PPC setups, and the two degraded schemes strictly outperform the best known schemes in terms of (asymptotic) PPC rate, which is the main concern of PPC schemes.
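
To get a feel for the two rate expressions stated in this abstract, they can be evaluated with exact rational arithmetic. The sketch below is an illustration only: the function names and the sample parameter values are assumptions, and the positivity checks are added for convenience rather than taken from the paper.

```python
from fractions import Fraction


def ppc_rate(N: int, K: int, X: int, T: int, B: int, U: int, G: int) -> Fraction:
    """Evaluate 1 - (G(K+X-1) + T + 2B) / (N - U), as quoted in the abstract."""
    overhead = G * (K + X - 1) + T + 2 * B
    responsive = N - U
    # Assumption for this sketch: only report strictly positive rates.
    if responsive <= overhead:
        raise ValueError("parameters give a non-positive PPC rate")
    return 1 - Fraction(overhead, responsive)


def secrecy_rate(N: int, K: int, X: int, T: int, B: int, U: int, G: int) -> Fraction:
    """Evaluate (G(K+X-1) + T) / (N - (G(K+X-1) + T + 2B + U)), as quoted."""
    numerator = G * (K + X - 1) + T
    denominator = N - (numerator + 2 * B + U)
    if denominator <= 0:
        raise ValueError("parameters give a non-positive denominator")
    return Fraction(numerator, denominator)


if __name__ == "__main__":
    # Hypothetical parameter choice, used only to exercise the formulas.
    params = dict(N=20, K=2, X=1, T=2, B=1, U=2, G=2)
    print("PPC rate:", ppc_rate(**params))
    print("secrecy rate:", secrecy_rate(**params))
```

With the sample values above, G(K+X-1) = 4, so the PPC rate is 1 - 8/18 = 5/9 and the secrecy rate is 6/10 = 3/5.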

Information Theory

Key Superposition Simultaneously Achieves Security and Privacy in Cache-Aided Linear Function Retrieval

Qifa Yan, Daniela Tuninetti, 2020

This work investigates the problem of cache-aided content Secure and demand Private Linear Function Retrieval (SP-LFR), where three constraints are imposed on the system: (a) each user is interested in retrieving an arbitrary linear combination of the files in the server's library; (b) the content of the library must be kept secure from a wiretapper who obtains the signal sent by the server; and (c) no subset of colluding users can obtain information about the demands of the remaining users. A procedure is proposed to derive an SP-LFR scheme from a given Placement Delivery Array (PDA), which is known to give coded caching schemes with low subpacketization for systems with neither security nor privacy constraints. This procedure uses the superposition of security keys and privacy keys in both the cache placement and the transmitted signal to guarantee content security and demand privacy, respectively. In particular, among all PDA-based SP-LFR schemes, the memory-load pairs achieved by the PDA describing the Maddah-Ali and Niesen's scheme are Pareto-optimal and have the lowest subpacketization. Moreover, the achieved load-memory tradeoff is optimal to within a constant multiplicative gap, except in the small memory regime (i.e., when the cache size is between 1 and 2) when the number of files is smaller than the number of users. Remarkably, the memory-load tradeoff does not increase compared to the best known schemes that guarantee either only content security in all regimes or only demand privacy in the regime mentioned above.

Information Theory

Fundamental Limits of Caching for Demand Privacy against Colluding Users

Qifa Yan, Daniela Tuninetti, 2020

This work investigates the problem of demand privacy against colluding users for shared-link coded caching systems, where no subset of users can learn any information about the demands of the remaining users. The notion of privacy used here is stronger than similar notions adopted in past work and is motivated by the practical need to ensure privacy regardless of the file distribution. Two scenarios are considered: Single File Retrieval (SFR) and Linear Function Retrieval (LFR), where in the latter case each user demands an arbitrary linear combination of the files at the server. The main contributions of this paper are a novel achievable scheme for LFR, referred to as the privacy key scheme, and a new information-theoretic converse bound for SFR. Clearly, since SFR is a special case of LFR, an achievable scheme for LFR works for SFR as well, and a converse for SFR is a valid converse for LFR. By comparing the performance of the achievable scheme with the converse bound derived in this paper (for the small cache size regime) and existing converse bounds without privacy constraints (in the remaining memory regime), the communication load of the privacy key scheme turns out to be optimal to within a constant multiplicative gap in all parameter regimes. Numerical results show that, in some regimes, the new privacy key scheme outperforms known schemes based on the idea of virtual users, which also satisfy the stronger notion of user privacy against colluding users adopted here. Moreover, the privacy key scheme enjoys much lower subpacketization than known schemes based on virtual users.

Information Theory

Adaptive Gradient Coding

Hankun Cao, Qifa Yan, Xiaohu Tang, 2020

This paper focuses on mitigating the impact of stragglers in distributed learning systems. Unlike existing results designed for a fixed number of stragglers, we develop a new scheme called Adaptive Gradient Coding (AGC) that flexibly tolerates a varying number of stragglers. Our scheme achieves an optimal tradeoff between computation load, straggler tolerance, and communication cost. In particular, it allows the communication cost to be minimized according to the real-time number of stragglers in practical environments. Implementations on Amazon EC2 clusters using Python with the mpi4py package verify the flexibility in several situations.

Information Theory

A New Capacity-Achieving Private Information Retrieval Scheme with (Almost) Optimal File Length for Coded Servers

Jinbao Zhu, Qifa Yan, Chao Qi, 2019

In a distributed storage system, private information retrieval (PIR) guarantees that a user retrieves one file from the system without revealing any information about the identity of the file of interest to any individual server. In this paper, we investigate the $(N,K,M)$ coded server model of PIR, where each of the $M$ files is distributed to the $N$ servers in the form of an $(N,K)$ maximum distance separable (MDS) code for some $N>K$ and $M>1$. As a result, we propose a new capacity-achieving $(N,K,M)$ coded linear PIR scheme that can be implemented with file length $\frac{K(N-K)}{\gcd(N,K)}$, which is much smaller than the previous best result $K\big(\frac{N}{\gcd(N,K)}\big)^{M-1}$. Notably, among all the capacity-achieving coded linear PIR schemes, we show that the file length is optimal if $M>\big\lfloor \frac{K}{\gcd(N,K)}-\frac{K}{N-K}\big\rfloor+1$, and within a multiplicative gap $\frac{K}{\gcd(N,K)}$ of a lower bound on the minimum file length otherwise.
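
The gap between the two file lengths quoted in this abstract grows quickly with the number of files, which a short numerical comparison makes visible. The sketch below only evaluates the stated expressions; the function names and sample parameters are hypothetical.

```python
from fractions import Fraction
from math import floor, gcd


def new_file_length(N: int, K: int) -> int:
    """Evaluate K(N-K)/gcd(N,K), the file length of the proposed scheme."""
    if not 0 < K < N:
        raise ValueError("expected N > K > 0")
    # gcd(N, K) divides K, so the integer division is exact.
    return K * (N - K) // gcd(N, K)


def previous_file_length(N: int, K: int, M: int) -> int:
    """Evaluate K * (N/gcd(N,K))^(M-1), the previous best result as quoted."""
    if not 0 < K < N or M <= 1:
        raise ValueError("expected N > K > 0 and M > 1")
    return K * (N // gcd(N, K)) ** (M - 1)


def optimality_threshold(N: int, K: int) -> int:
    """Evaluate floor(K/gcd(N,K) - K/(N-K)) + 1; optimality is stated for M above it."""
    if not 0 < K < N:
        raise ValueError("expected N > K > 0")
    return floor(Fraction(K, gcd(N, K)) - Fraction(K, N - K)) + 1


if __name__ == "__main__":
    N, K = 6, 4  # hypothetical (N, K) MDS code parameters
    for M in (2, 3, 5, 10):
        print(M, new_file_length(N, K), previous_file_length(N, K, M),
              M > optimality_threshold(N, K))
```

For N = 6 and K = 4, the new file length is 4 regardless of M, while the previous expression gives 36 already at M = 3.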

Information Theory

A Fundamental Storage-Communication Tradeoff for Distributed Computing with Straggling Nodes

Qifa Yan, Michèle Wigger, Sheng Yang, 2019

Placement delivery arrays for distributed computing (Comp-PDAs) have recently been proposed as a framework to construct universal computing schemes for MapReduce-like systems. In this work, we extend this concept to systems with straggling nodes, i.e., to systems where a subset of the nodes cannot accomplish the assigned map computations in due time. Unlike most previous works, which focused on computing linear functions, our results are universal and apply to arbitrary map and reduce functions. Our contributions are as follows. First, we show how to construct a universal coded computing scheme for MapReduce-like systems with straggling nodes from any given Comp-PDA. We also characterize the storage and communication loads of the resulting scheme in terms of the Comp-PDA parameters. Then, we prove an information-theoretic converse bound on the storage-communication (SC) tradeoff achieved by universal computing schemes with straggling nodes. We show that the information-theoretic bound matches the performance achieved by the coded computing schemes with straggling nodes corresponding to the Maddah-Ali and Niesen (MAN) PDAs, i.e., to the Comp-PDAs describing Maddah-Ali and Niesen's coded caching scheme. Interestingly, the same Comp-PDAs (the MAN-PDAs) are optimal for any number of straggling nodes, which implies that the map phase of optimal coded computing schemes does not need to be adapted to the number of stragglers in the system. Finally, we prove that while the points that lie exactly on the fundamental SC tradeoff cannot be achieved with Comp-PDAs that require a smaller number of files than the MAN-PDAs, this is possible for some of the points that lie close to the SC tradeoff. For these latter points, the decrease in the required number of files can be exponential in the number of nodes of the system.
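
The abstract refers to MAN-PDAs without spelling out their structure. As a rough illustration, the sketch below builds the array usually associated with the Maddah-Ali and Niesen scheme in the PDA literature and checks the usual PDA condition on repeated symbols. This construction and the condition are taken from the general PDA literature as an assumption, not from this abstract; the function names and the parameter `t` are hypothetical, and the straggler-related Comp-PDA extensions of the paper are not modeled.

```python
from itertools import combinations

STAR = "*"


def man_pda(K: int, t: int):
    """Build a C(K,t) x K array: rows are t-subsets T of the K nodes.

    Entry (T, k) is a star if k is in T; otherwise it is an integer label
    of the (t+1)-subset T | {k}.
    """
    if not 1 <= t < K:
        raise ValueError("expected 1 <= t < K")
    label = {S: i for i, S in enumerate(combinations(range(K), t + 1))}
    array = []
    for T in combinations(range(K), t):
        row = []
        for k in range(K):
            if k in T:
                row.append(STAR)
            else:
                row.append(label[tuple(sorted(T + (k,)))])
        array.append(row)
    return array


def satisfies_symbol_condition(array) -> bool:
    """Check: equal integers sit in distinct rows and columns, with stars at the two cross positions."""
    cells = {}
    for r, row in enumerate(array):
        for c, entry in enumerate(row):
            if entry != STAR:
                cells.setdefault(entry, []).append((r, c))
    for positions in cells.values():
        for (r1, c1), (r2, c2) in combinations(positions, 2):
            if r1 == r2 or c1 == c2:
                return False
            if array[r1][c2] != STAR or array[r2][c1] != STAR:
                return False
    return True


if __name__ == "__main__":
    for row in man_pda(4, 2):
        print(" ".join(str(entry) for entry in row))
    print(satisfies_symbol_condition(man_pda(4, 2)))
```

The number of rows, C(K,t), plays the role of the number of files in the abstract's discussion, which is why it grows quickly with the number of nodes K.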

Information Theory
