Distributed computation is a framework for breaking a complex computational task into smaller tasks and distributing them among computational nodes. Erasure correction codes have recently been introduced and have become a popular workaround to the well-known ``straggling nodes'' problem, in particular by matching linear coding to linear computation tasks. It has been observed that decoding tends to amplify the computation ``noise'', i.e., the numerical errors at the computation nodes. We propose taking advantage of the case in which more nodes return than are minimally required. We show how a clever construction of a polynomial code, inspired by recent results on robust frames, can significantly reduce the amplification of noise and achieve graceful degradation with the number of straggler nodes.
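To make the noise-amplification issue concrete, here is a minimal sketch, not the construction proposed above: a polynomial code for a distributed matrix-vector product in which the decoder inverts a Vandermonde system. The block sizes, evaluation points, noise level, and returning-worker set are illustrative assumptions; the condition number of the Vandermonde system governs how much worker-side numerical error is amplified.

```python
import numpy as np

# Sketch of polynomial coding for a distributed matrix-vector product A @ x.
# All sizes, evaluation points, and the noise level are illustrative.
rng = np.random.default_rng(0)
k, n = 4, 8                        # k data blocks, n workers (n - k stragglers tolerated)
A = rng.standard_normal((40, 20))
x = rng.standard_normal(20)
blocks = np.split(A, k)            # partition A row-wise into k blocks

# Worker i receives p(alpha_i) = sum_j blocks[j] * alpha_i**j and computes p(alpha_i) @ x.
alphas = np.linspace(-1.0, 1.0, n)
tasks = [sum(B * a**j for j, B in enumerate(blocks)) for a in alphas]
results = [T @ x + 1e-8 * rng.standard_normal(T.shape[0]) for T in tasks]  # computation noise

# Decode from the k workers that return first: invert a k x k Vandermonde system.
returned = [0, 2, 5, 7]            # an arbitrary set of k returning workers
V = np.vander(alphas[returned], k, increasing=True)
decoded = np.linalg.solve(V, np.stack([results[i] for i in returned]))

# Noise amplification is governed by cond(V): clustered evaluation points
# (i.e., unlucky straggler patterns) make decoding ill-conditioned.
print("cond(V) =", np.linalg.cond(V))
print("max block error:", max(np.abs(d - B @ x).max() for d, B in zip(decoded, blocks)))
```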
This chapter deals with the design of reliable and efficient codes for the storage and retrieval of large quantities of data over storage devices that are prone to failure. For a long time, the traditional objective has been to ensure reliability against data loss while minimizing storage overhead. More recently, a third concern has surfaced, namely the need to efficiently recover from the failure of a single storage unit, corresponding to recovery from the erasure of a single code symbol. We explain here how coding theory has evolved to tackle this fresh challenge.
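As a toy illustration of the single-failure repair problem, and not one of the code families discussed in the chapter, compare an MDS layout, where repairing one symbol means contacting k surviving symbols, with a locally repairable layout, where a small local parity suffices:

```python
import functools, operator

# Toy symbols: four data chunks protected by two local XOR parities.
# An illustrative locally repairable layout, not a chapter construction.
data = [b"a1", b"b2", b"c3", b"d4"]

def xor(chunks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(functools.reduce(operator.xor, t) for t in zip(*chunks))

# Two local groups, each with its own parity symbol.
group1, group2 = data[:2], data[2:]
p1, p2 = xor(group1), xor(group2)

# Repairing one lost symbol reads only its local group (here, 2 symbols),
# whereas a (6,4) MDS code would have to contact k = 4 surviving symbols.
repaired = xor([group1[1], p1])
assert repaired == group1[0]
```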
Non-orthogonal multiple access (NOMA) is a leading technology that has gained considerable interest over the past several years. It enables higher user density and is therefore suited to modern systems such as 5G and IoT. In this paper we examine different frame-based codes for a partially active NOMA system, a more realistic setting in which only some of the users in an overly populated system are active simultaneously. We introduce a new analysis approach in which the active user ratio, a system feature, is kept constant while frames of different sizes are employed. The frame types are partially derived from previous papers on the subject [1][2] and partially novel, such as the LPF and the Steiner ETF. We find that the best capacity-achieving frame depends on the active user ratio, and three distinct ranges are identified. In addition, we introduce a measure called practical capacity, which is the maximal rate achieved by a simple coding scheme. ETFs always achieve the best practical capacity, while LPFs and sparse frames perform worse than a random frame.
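As a rough illustration of why ETFs are attractive in this setting (our own toy check, with dimensions chosen arbitrarily): the worst-case cross-correlation of any unit-norm frame is bounded below by the Welch bound, which ETFs meet with equality, while a random frame typically falls short of it.

```python
import numpy as np

# Compare the coherence (worst-case cross-correlation) of a random frame of
# n unit-norm signature vectors in d dimensions against the Welch bound.
rng = np.random.default_rng(1)
d, n = 6, 16                       # d chips per signature, n users (illustrative)
F = rng.standard_normal((d, n))
F /= np.linalg.norm(F, axis=0)     # unit-norm columns

G = np.abs(F.T @ F)                # magnitudes of pairwise inner products
np.fill_diagonal(G, 0)
coherence = G.max()

welch = np.sqrt((n - d) / (d * (n - 1)))   # lower bound, met with equality by ETFs
print(f"random-frame coherence {coherence:.3f} vs Welch bound {welch:.3f}")
```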
We consider a MapReduce-type task running in a distributed computing model which consists of $K$ edge computing nodes distributed across the edge of the network and a Master node that assists the edge nodes in computing output functions. The Master node and the edge nodes, both equipped with some storage memory and computing capability, are connected through a multicast network. We define the communication time spent during transmission for the sequential implementation (all nodes send symbols sequentially) and the parallel implementation (the Master node can send symbols during the edge nodes' transmission), respectively. We propose a mixed coded distributed computing scheme that divides the system into two subsystems, where the coded distributed computing (CDC) strategy proposed by Songze Li \emph{et al.} is applied to the first subsystem and a novel master-aided CDC strategy is applied to the second subsystem. We prove that this scheme is optimal, i.e., it achieves the minimum communication time for both the sequential and parallel implementations, and we establish an \emph{optimal} information-theoretic tradeoff between the overall communication time, the computation load, and the Master node's storage capacity. This demonstrates that incorporating a Master node with storage and computing capabilities can further reduce the communication time. For the sequential implementation, we deduce the approximately optimal file allocation between the two subsystems, which shows that the Master node should map as many files as possible in order to achieve a smaller communication time. For the parallel implementation, if the Master node's storage and computing capabilities are sufficiently large (it need not store and map all files), then the proposed scheme requires at most 1/2 of the minimum communication time of the system without the help of the Master node.
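For intuition about the CDC strategy that the first subsystem builds on, here is the classic three-node shuffle example from Li et al.'s scheme, sketched with stand-in byte values rather than real map outputs: a single XOR multicast serves two nodes at once.

```python
# The classic 3-node CDC shuffle with computation load r = 2: one XOR
# multicast serves two nodes simultaneously, halving the shuffle traffic
# relative to uncoded unicast.  Values below are illustrative stand-ins.

def v(func, file):
    """Stand-in intermediate value v_{func,file}, one byte for illustration."""
    return (31 * func + 7 * file) % 256

# Node k maps every file except file k and reduces function k,
# so the only value node k is missing is v(k, k).
# Node 1 maps files {0, 2}; it multicasts the XOR of the two values
# that nodes 0 and 2 are missing:
multicast = v(0, 0) ^ v(2, 2)

# Node 0 mapped file 2, so it already knows v(2, 2) and cancels it:
assert multicast ^ v(2, 2) == v(0, 0)
# Node 2 mapped file 0, so it cancels v(0, 0) and recovers v(2, 2):
assert multicast ^ v(0, 0) == v(2, 2)
```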
Coded computation can be used to speed up distributed learning in the presence of straggling workers. Partial recovery of the gradient vector can further reduce the computation time at each iteration; however, this can result in biased estimators, which may slow down convergence, or even cause divergence. Estimator bias is particularly prevalent when the straggling behavior is correlated over time, which results in the gradient estimators being dominated by a few fast servers. To mitigate the bias, we design a \emph{timely} dynamic encoding framework for partial recovery that includes an ordering operator that changes the codewords and the computation order at the workers over time. To regulate the recovery frequencies, we adopt an \emph{age} metric in the design of the dynamic encoding scheme. We show through numerical results that the proposed dynamic encoding strategy increases the timeliness of the recovered computations and, as a result, reduces the bias in model updates and accelerates convergence compared to conventional static partial recovery schemes.
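The ordering idea can be sketched as follows, in a simplified simulation under our own assumptions about straggling (the actual encoder and age metric are as designed in the paper): partitions that have not been recovered for many iterations are moved to the front of the workers' computation orders.

```python
import numpy as np

# Toy simulation of age-based reordering: stragglers finish only a random
# prefix of their queue, and stale partitions are pushed to the front.
rng = np.random.default_rng(0)
n_workers, n_parts, iters = 4, 8, 20
age = np.zeros(n_parts, dtype=int)     # iterations since each partition was recovered

for t in range(iters):
    # Ordering operator: high-age partitions go to the head of the queues.
    order = np.argsort(-age)
    queues = np.array_split(order, n_workers)

    recovered = set()
    for q in queues:
        done = rng.integers(0, len(q) + 1)   # straggler finishes only a prefix
        recovered.update(q[:done].tolist())

    age += 1
    age[list(recovered)] = 0                 # recovered partitions are fresh again

print("final ages:", age)
```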
We present a communication-efficient distributed protocol for computing the Babai point, an approximate nearest point for a random vector $\mathbf{X}\in\mathbb{R}^n$ in a given lattice. We show that the protocol is optimal in the sense that it minimizes the sum rate when the components of $\mathbf{X}$ are mutually independent. We then investigate the error probability, i.e., the probability that the Babai point does not coincide with the nearest lattice point. In dimensions two and three, this probability is seen to grow with the packing density. For higher dimensions, we use a bound from probability theory to estimate the error probability for some well-known lattices. Our investigations suggest that for uniform distributions, the error probability becomes large with the dimension of the lattice for lattices with good packing densities. We also consider the case where $\mathbf{X}$ is obtained by adding Gaussian noise to a randomly chosen lattice point. In this case, the error probability goes to zero with the lattice dimension when the noise variance is sufficiently small. In such cases, a distributed algorithm for finding the approximate nearest lattice point is sufficient for finding the nearest lattice point.
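For reference, the Babai point can be computed centrally with Babai's nearest-plane procedure, a standard algorithm; the protocol above computes this point distributively, so the sketch below is only the centralized baseline, with an arbitrary example lattice.

```python
import numpy as np

# Babai's nearest-plane algorithm: QR-factorize the basis, then
# back-substitute with rounding to pick an approximate nearest lattice point.
def babai_nearest_plane(B, x):
    """Return an approximate nearest point to x in the lattice B @ Z^n."""
    Q, R = np.linalg.qr(B)            # B = Q R, R upper triangular
    y = Q.T @ x
    n = B.shape[1]
    c = np.zeros(n)
    for i in reversed(range(n)):      # round one coordinate at a time
        c[i] = np.round((y[i] - R[i, i + 1:] @ c[i + 1:]) / R[i, i])
    return B @ c

# Example: the hexagonal lattice A2.  The Babai point may differ from the
# true nearest lattice point; that mismatch is exactly the error event above.
B = np.array([[1.0, 0.5],
              [0.0, np.sqrt(3) / 2]])
print(babai_nearest_plane(B, np.array([0.9, 0.8])))
```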