We present a communication-efficient distributed protocol for computing the Babai point, an approximate nearest point for a random vector $\mathbf{X} \in \mathbb{R}^n$ in a given lattice. We show that the protocol is optimal in the sense that it minimizes the sum rate when the components of $\mathbf{X}$ are mutually independent. We then investigate the error probability, i.e., the probability that the Babai point does not coincide with the nearest lattice point. In dimensions two and three, this probability is seen to grow with the packing density. For higher dimensions, we use a bound from probability theory to estimate the error probability for some well-known lattices. Our investigations suggest that for uniform distributions, the error probability becomes large with the dimension of the lattice for lattices with good packing densities. We also consider the case where $\mathbf{X}$ is obtained by adding Gaussian noise to a randomly chosen lattice point. In this case, the error probability goes to zero with the lattice dimension when the noise variance is sufficiently small. In such cases, a distributed algorithm for finding the approximate nearest lattice point is sufficient for finding the nearest lattice point.
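As a toy illustration of the error-probability question (our own construction, not taken from the paper), the following Python sketch estimates by Monte Carlo how often the Babai (nearest-plane) point differs from the true nearest point for a dense two-dimensional lattice and a uniformly distributed source; the basis, sampling box, and search radius are arbitrary choices.

import numpy as np
import itertools

def babai_point(B, x):
    # Nearest-plane rounding for an upper-triangular basis B (columns are basis vectors).
    n = B.shape[1]
    u = np.zeros(n)
    r = x.astype(float).copy()
    for i in range(n - 1, -1, -1):
        u[i] = np.rint(r[i] / B[i, i])
        r = r - u[i] * B[:, i]
    return B @ u

def brute_force_nearest(B, x, radius=4):
    # Exhaustive search over a small window of integer coefficients (sufficient in 2-D here).
    best, best_d = None, np.inf
    for coeffs in itertools.product(range(-radius, radius + 1), repeat=B.shape[1]):
        p = B @ np.array(coeffs, dtype=float)
        d = np.linalg.norm(x - p)
        if d < best_d:
            best, best_d = p, d
    return best

rng = np.random.default_rng(0)
B = np.array([[1.0, 0.5], [0.0, np.sqrt(3) / 2]])   # a dense (hexagonal) 2-D lattice basis
trials, errors = 2000, 0
for _ in range(trials):
    x = rng.uniform(-2.0, 2.0, size=2)              # uniform source, as in the abstract
    if not np.allclose(babai_point(B, x), brute_force_nearest(B, x)):
        errors += 1
print("estimated Babai error probability:", errors / trials)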
Distributed computation is a framework used to break down a complex computational task into smaller tasks and distribute them among computational nodes. Erasure-correcting codes have recently been introduced and have become a popular workaround to the well-known ``straggling nodes'' problem, in particular by matching linear codes to linear computation tasks. It has been observed that decoding tends to amplify the computation ``noise,'' i.e., the numerical errors at the computation nodes. We propose taking advantage of the case where more nodes return results than are minimally required. We show how a clever construction of a polynomial code, inspired by recent results on robust frames, can significantly reduce the amplification of noise and achieve graceful degradation with the number of straggler nodes.
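The following Python sketch (our own toy setup, not the paper's construction) illustrates the underlying mechanism: a polynomial code for distributed matrix-vector multiplication, where decoding from exactly the minimum number of worker responses amounts to inverting a Vandermonde system, while a least-squares decode over all returned responses reduces how much the workers' numerical noise is amplified. The block sizes, evaluation points, and noise level are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
k, m, rows, cols = 4, 8, 8, 6              # k blocks needed, m workers in total
A = rng.standard_normal((k * rows, cols))
x = rng.standard_normal(cols)
blocks = np.split(A, k)                     # row blocks A_1, ..., A_k

eval_pts = np.cos(np.pi * (np.arange(m) + 0.5) / m)   # well-spread evaluation points
def encode(z):                              # encoding polynomial: sum_j A_j z^j
    return sum(blocks[j] * z**j for j in range(k))

# Each worker i computes encode(z_i) @ x, adding a small numerical error.
noise = 1e-6 * rng.standard_normal((m, rows))
results = np.array([encode(eval_pts[i]) @ x + noise[i] for i in range(m)])

# Decode: results[i] = sum_j (A_j x) z_i^j, a Vandermonde system in the unknowns A_j x.
V = np.vander(eval_pts, k, increasing=True)
decoded_k   = np.linalg.solve(V[:k], results[:k])          # use exactly k responses
decoded_all = np.linalg.lstsq(V, results, rcond=None)[0]   # least squares over all responses
truth = np.array([b @ x for b in blocks])
print("error with k responses:  ", np.linalg.norm(decoded_k - truth))
print("error with all responses:", np.linalg.norm(decoded_all - truth))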
In this paper, we consider the distributed mean estimation problem where the server has access to some side information, e.g., its locally computed mean estimate or the information received from the distributed clients in previous iterations. We propose a practical and efficient estimator based on an r-bit Wyner-Ziv estimator proposed by Mayekar et al., which requires no probabilistic assumptions on the data. Unlike Mayekar's work, which only utilizes side information at the server, our scheme jointly exploits the correlation between the clients' data and the server's side information, as well as the correlation between the data of different clients. We derive an upper bound on the estimation error of the proposed estimator. Based on this upper bound, we provide two algorithms for choosing the input parameters of the estimator. Finally, we characterize the parameter regions in which our estimator outperforms the previous one.
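To convey the Wyner-Ziv idea behind such estimators, here is a minimal one-dimensional sketch of a modulo quantizer with server side information, under our own assumptions on the bin width and the correlation; it is not the exact estimator of Mayekar et al., only the basic mechanism of resolving a coarse modulo index with side information.

import numpy as np

def encode(x, r, delta):
    # Quantize x to a grid of step delta, then keep only the index modulo 2**r (r bits).
    return int(np.floor(x / delta)) % (2 ** r)

def decode(idx, y, r, delta):
    # Among all grid points whose index is congruent to idx mod 2**r,
    # pick the one closest to the side information y.
    period = (2 ** r) * delta
    base = idx * delta + delta / 2          # one representative of the coset
    k = np.rint((y - base) / period)        # shift by whole periods toward y
    return base + k * period

rng = np.random.default_rng(2)
r, delta = 3, 0.25
x = rng.uniform(-10, 10)                    # client's value
y = x + rng.normal(scale=0.3)               # server's side information, correlated with x
x_hat = decode(encode(x, r, delta), y, r, delta)
print(f"x={x:.3f}  side info y={y:.3f}  reconstruction x_hat={x_hat:.3f}")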
We consider the problem of finding the closest lattice point to a vector in n-dimensional Euclidean space when each component of the vector is available at a distinct node in a network. Our objectives are to (i) minimize the communication cost and (ii) determine the error probability. The approximate closest lattice point considered here is the one obtained using the nearest-plane (Babai) algorithm. Assuming a special triangular basis for the lattice, we develop communication-efficient protocols for computing the approximate lattice point and determine the communication cost for lattices of dimension n > 1. Based on available parameterizations of reduced bases, we determine the error probability of the nearest-plane algorithm for two-dimensional lattices analytically, and present a computational error estimation algorithm in three dimensions. For dimensions two and three, our results show that the error probability increases with the packing density of the lattice.
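A minimal sketch of how the nearest-plane computation can be organized sequentially when the basis is upper triangular: node i holds only the i-th component of the vector and row i of the basis, and the only messages passed down the chain are the integer coefficients already fixed by higher-indexed nodes. This only illustrates the sequential structure and is not the paper's rate-optimal protocol; the basis B and vector x below are arbitrary.

import numpy as np

def distributed_nearest_plane(B, x):
    # B is upper triangular; conceptually, node i holds x[i] and row i of B.
    n = len(x)
    u = {}                              # integer coefficients computed so far
    for i in range(n - 1, -1, -1):      # node i acts after hearing from nodes i+1, ..., n-1
        correction = sum(B[i, j] * u[j] for j in range(i + 1, n))
        u[i] = int(np.rint((x[i] - correction) / B[i, i]))
        # node i then forwards the integers computed so far to node i-1
    return np.array([u[i] for i in range(n)])

B = np.array([[2.0, 1.0, 0.5],
              [0.0, 1.5, 0.3],
              [0.0, 0.0, 1.0]])
x = np.array([0.7, -1.2, 2.4])
u = distributed_nearest_plane(B, x)
print("Babai coefficients:", u, " Babai point:", B @ u)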
Placement delivery arrays for distributed computing (Comp-PDAs) have recently been proposed as a framework to construct universal computing schemes for MapReduce-like systems. In this work, we extend this concept to systems with straggling nodes, i.e., to systems where a subset of the nodes cannot accomplish the assigned map computations in due time. Unlike most previous works that focused on computing linear functions, our results are universal and apply to arbitrary map and reduce functions. Our contributions are as follows. Firstly, we show how to construct a universal coded computing scheme for MapReduce-like systems with straggling nodes from any given Comp-PDA. We also characterize the storage and communication loads of the resulting scheme in terms of the Comp-PDA parameters. Then, we prove an information-theoretic converse bound on the storage-communication (SC) tradeoff achieved by universal computing schemes with straggling nodes. We show that the information-theoretic bound matches the performance achieved by the coded computing schemes with straggling nodes corresponding to the Maddah-Ali and Niesen (MAN) PDAs, i.e., to the Comp-PDAs describing Maddah-Ali and Niesen's coded caching scheme. Interestingly, the same Comp-PDAs (the MAN-PDAs) are optimal for any number of straggling nodes, which implies that the map phase of optimal coded computing schemes does not need to be adapted to the number of stragglers in the system. We finally prove that while the points that lie exactly on the fundamental SC tradeoff cannot be achieved with Comp-PDAs that require a smaller number of files than the MAN-PDAs, this is possible for some of the points that lie close to the SC tradeoff. For these latter points, the decrease in the required number of files can be exponential in the number of nodes of the system.
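For reference, the sketch below enumerates the standard Maddah-Ali and Niesen file placement in our own notation (K nodes, redundancy parameter r): the files are split into C(K, r) equal batches, one per r-subset of nodes, and each batch is mapped at every node of its subset. This placement underlies the MAN-PDAs mentioned above; the full Comp-PDA construction with stragglers involves considerably more than this placement alone.

from itertools import combinations

def man_placement(K, r):
    # Returns, for each node, the list of file batches it stores.
    placement = {node: [] for node in range(K)}
    for batch, subset in enumerate(combinations(range(K), r)):
        for node in subset:
            placement[node].append(batch)
    return placement

K, r = 4, 2
for node, batches in man_placement(K, r).items():
    print(f"node {node} stores batches {batches}")
# Each node stores C(K-1, r-1) of the C(K, r) batches, i.e., a fraction r/K of the files.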
In wireless distributed computing, networked nodes perform intermediate computations over data placed in their memory and exchange these intermediate values to calculate function values. In this paper we consider an asymmetric setting where each node has access to a random subset of the data, i.e., we cannot control the data placement. The paper makes a simple point: we can realize significant benefits if we are allowed to be flexible and decide which node computes which function in our system. We make this argument in the case where each function depends on only two of the data messages, as is the case in similarity searches. We establish a percolation phenomenon in the behavior of the system: depending on the amount of observed data, by being flexible we may need no communication at all.
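A small simulation sketch (our own toy model, not the paper's exact setting) of why flexibility helps when each function depends on two messages: with N messages, K nodes, and each node observing each message independently with probability p, a fixed round-robin assignment often places a function at a node missing one or both inputs, whereas a flexible assignment sends each function to the node missing the fewest of its two inputs.

import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
N, K, p = 12, 6, 0.6
has = rng.random((K, N)) < p                       # has[k, m]: node k observed message m
pairs = list(combinations(range(N), 2))            # one function per pair of messages

# Fixed assignment: function i goes to node i mod K, which fetches whatever it is missing.
fixed = sum(2 - int(has[i % K, a]) - int(has[i % K, b]) for i, (a, b) in enumerate(pairs))
# Flexible assignment: each function goes to the node missing the fewest of its two inputs.
flexible = sum(min(2 - int(has[k, a]) - int(has[k, b]) for k in range(K)) for (a, b) in pairs)

print("messages fetched under a fixed assignment:   ", fixed)
print("messages fetched under a flexible assignment:", flexible)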