This paper goes beyond resilience to study security and local repairability in distributed storage systems (DSS). Both are important features of an efficient storage system, and the paper aims to understand the trade-offs among resilience, security, and local repairability in these systems. First, it investigates security in the presence of colluding eavesdroppers, who are assumed to work together to decode the stored information. Second, it focuses on coding schemes that enable optimal local repairs. It then brings the two concepts together to develop locally repairable coding schemes for DSS that are secure against eavesdroppers. The main results are: (a) an improved bound on the secrecy capacity for minimum storage regenerating codes, (b) secure coding schemes that achieve the bound in some special cases, (c) a new bound on the minimum distance of locally repairable codes, (d) constructions of locally repairable codes that attain the minimum distance bound, and (e) repair-bandwidth-efficient locally repairable codes with and without security constraints.
In large-scale distributed storage systems (DSS) deployed in cloud computing, correlated failures that make whole blocks of nodes simultaneously fail (or become unavailable) are common. In such scenarios, the stored data or the content of a failed node can only be reconstructed from the live nodes in the available blocks. To analyze the resilience of the system against such block failures, this work introduces the framework of Block Failure Resilient (BFR) codes, wherein the data (e.g., a file in a DSS) can be decoded by reading the same number of codeword symbols (nodes) from a subset of the available blocks of the underlying codeword. Further, repairable BFR codes are introduced, wherein any codeword symbol in a failed block can be repaired by contacting a subset of the remaining blocks in the system. File size bounds for repairable BFR codes are derived, the trade-off between per-node storage and repair bandwidth is analyzed, and the corresponding minimum storage regenerating (BFR-MSR) and minimum bandwidth regenerating (BFR-MBR) points are derived. Explicit codes achieving the two operating points are constructed for a special case of parameters, wherein the underlying regenerating codewords are distributed to BFR codeword symbols according to combinatorial designs. Finally, BFR locally repairable codes (BFR-LRC) are introduced, an upper bound on the resilience is derived, and optimal code constructions are provided by concatenating Gabidulin and MDS codes. The repair efficiency of BFR-LRCs is further studied via the use of BFR-MSR/MBR codes as local codes. Code constructions achieving optimal resilience for BFR-MSR/MBR-LRCs are provided for certain parameter regimes. Overall, this work introduces the framework of block failures along with optimal code constructions, and the study of architecture-aware coding for distributed storage systems.
Locally repairable codes with locality $r$ ($r$-LRCs for short) were introduced by Gopalan et al. \cite{1} so that a failed node of the code can be recovered from at most $r$ other available nodes. Subsequently, $(r,\delta)$ locally repairable codes ($(r,\delta)$-LRCs for short) were introduced by Prakash et al. \cite{2} to tolerate multiple failed nodes. An $r$-LRC can be viewed as an $(r,2)$-LRC. An $(r,\delta)$-LRC is called optimal if it achieves the Singleton-type bound. Constructing $q$-ary optimal $(r,\delta)$-LRCs with length much larger than $q$ has been a great challenge. Surprisingly, Luo et al. \cite{3} presented a construction of $q$-ary optimal $r$-LRCs of minimum distances 3 and 4 with unbounded lengths (i.e., the lengths of these codes are independent of $q$) via cyclic codes. In this paper, inspired by the work of \cite{3}, we first construct two classes of optimal cyclic $(r,\delta)$-LRCs with unbounded lengths and minimum distances $\delta+1$ or $\delta+2$, which generalize the results for the $\delta=2$ case given in \cite{3}. Second, under a slightly stronger condition, we present a construction of optimal cyclic $(r,\delta)$-LRCs with unbounded length and larger minimum distance $2\delta$. Furthermore, when $\delta=3$, we give another class of optimal cyclic $(r,3)$-LRCs with unbounded length and minimum distance $6$.
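For concreteness, the Singleton-type bound mentioned above can be evaluated numerically. The sketch below assumes the form usually attributed to Prakash et al., $d \le n - k + 1 - (\lceil k/r \rceil - 1)(\delta - 1)$, for a code of length $n$ and dimension $k$. The formula, the function name `singleton_type_bound`, and the example parameters are supplied here for illustration and are not taken from the abstract.

```python
def singleton_type_bound(n: int, k: int, r: int, delta: int = 2) -> int:
    """Largest minimum distance allowed by the Singleton-type bound.

    Assumed form (not stated in the abstract):
        d <= n - k + 1 - (ceil(k / r) - 1) * (delta - 1)
    """
    if not (1 <= r <= k <= n):
        raise ValueError("expected 1 <= r <= k <= n")
    if delta < 2:
        raise ValueError("expected delta >= 2")
    groups = -(-k // r)  # integer ceil(k / r)
    return n - k + 1 - (groups - 1) * (delta - 1)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the formula.
    print(singleton_type_bound(9, 4, 2))            # delta = 2 (an r-LRC): 5
    print(singleton_type_bound(12, 4, 2, delta=3))  # an (r, 3)-LRC: 7
    print(singleton_type_bound(10, 4, 4))           # r = k, classical Singleton bound: 7
```

With $\delta = 2$ the expression reduces to $n - k - \lceil k/r \rceil + 2$, which matches the view of an $r$-LRC as an $(r,2)$-LRC.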
As an important coding scheme in modern distributed storage systems, locally repairable codes (LRCs) have attracted considerable attention from the perspectives of both practical applications and theoretical research. As a major topic in the research on LRCs, bounds on and constructions of the corresponding optimal codes are of particular concern. In this work, codes with $(r,\delta)$-locality that have optimal minimum distance w.r.t. the bound given by Prakash et al. \cite{Prakash2012Optimal} are considered. Using a parity-check matrix approach, constructions of both optimal $(r,\delta)$-LRCs with all-symbol locality ($(r,\delta)_a$-LRCs) and optimal $(r,\delta)$-LRCs with information locality ($(r,\delta)_i$-LRCs) are provided. As a generalization of a work of Xing and Yuan \cite{XY19}, these constructions are built on a connection between sparse hypergraphs and optimal $(r,\delta)$-LRCs. With the help of constructions of large sparse hypergraphs, the length of the constructed codes can be super-linear in the alphabet size. This improves upon previous constructions when the minimum distance of the code is at least $3\delta+1$. As two applications, optimal H-LRCs with super-linear length and GSD codes with unbounded length are also constructed.
In this work, it is shown that locally repairable codes (LRCs) can be list-decoded efficiently beyond the Johnson radius for a large range of parameters by utilizing their local error-correction capabilities. The corresponding decoding radius is derived and its asymptotic behavior is analyzed. A general list-decoding algorithm for LRCs that achieves this radius is proposed, along with an explicit realization for LRCs that are subcodes of Reed--Solomon codes (e.g., Tamo--Barg LRCs). Further, a low-complexity probabilistic algorithm for unique decoding of LRCs is given and its success probability is analyzed. The second part of this work considers error decoding of LRCs and partial maximum distance separable (PMDS) codes through interleaved decoding. For a specific class of LRCs, the success probability of interleaved decoding is investigated. For PMDS codes, it is shown that there is a wide range of parameters for which interleaved decoding can increase the decoding radius beyond the minimum distance, such that the probability of successful decoding approaches $1$ as the code length goes to infinity.
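To make the comparison with the Johnson radius concrete, the sketch below contrasts it with the unique-decoding radius for a code of length $n$ and minimum distance $d$. It assumes the alphabet-independent form $n - \sqrt{n(n-d)}$ of the Johnson radius. The abstract does not say which form is meant, and the function names and example parameters are hypothetical. The LRC decoding radius derived in the work itself is not reproduced here.

```python
from math import sqrt


def unique_decoding_radius(d: int) -> int:
    """Number of errors that can always be corrected uniquely: floor((d - 1) / 2)."""
    if d < 1:
        raise ValueError("expected d >= 1")
    return (d - 1) // 2


def johnson_radius(n: int, d: int) -> float:
    """Alphabet-independent Johnson radius (assumed form): n - sqrt(n * (n - d))."""
    if not (1 <= d <= n):
        raise ValueError("expected 1 <= d <= n")
    return n - sqrt(n * (n - d))


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    n, d = 16, 9
    print("unique decoding radius:", unique_decoding_radius(d))
    print("Johnson radius:", round(johnson_radius(n, d), 3))
```

For these illustrative parameters the Johnson radius exceeds the unique-decoding radius, which is the gap that list decoding exploits.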
This chapter deals with the design of reliable and efficient codes for storing and retrieving large quantities of data on storage devices that are prone to failure. For a long time, the traditional objective has been to ensure reliability against data loss while minimizing storage overhead. More recently, a third concern has surfaced, namely the need to recover efficiently from the failure of a single storage unit, which corresponds to recovery from the erasure of a single code symbol. We explain here how coding theory has evolved to tackle this fresh challenge.