Research papers and master's and doctoral theses published by Frederique Oggier

Compressed Differential Erasure Codes for Efficient Archival of Versioned Data

82 - J. Harshan , Anwitaman Datta , Frederique Oggier 2015

In this paper, we study the problem of storing an archive of versioned data in a reliable and efficient manner in distributed storage systems. We propose a new storage technique called differential erasure coding (DEC) where the differences (deltas) between subseque

Information Theory; Distributed, Parallel, and Cluster Computing; Information Theory

Sparsity Exploiting Erasure Coding for Resilient Storage and Efficient I/O Access in Delta based Versioning Systems

83 - J. Harshan , Frederique Oggier , Anwitaman Datta 2014

In this paper, we study the problem of reliably storing an archive of versioned data. Specifically, we focus on systems where the differences (deltas) between subseque

Information Theory; Distributed, Parallel, and Cluster Computing; Information Theory

RapidRAID: Pipelined Erasure Codes for Fast Data Archival in Distributed Storage Systems

101 - Lluis Pamies-Juarez , Anwitaman Datta , Frederique Oggier 2012

To achieve reliability in distributed storage systems, data has usually been replicated across different nodes. However, the increasing volume of data to be stored has motivated the introduction of erasure codes, a storage-efficient alternative to replication, particularly suited for archival in data centers, where old datasets (rarely accessed) can be erasure encoded, while replicas are maintained only for the latest data. Many recent works consider the design of new storage-centric erasure codes for improved repairability. In contrast, this paper addresses the migration from replication to encoding: traditionally, erasure coding is an atomic operation in that a single node with the whole object encodes and uploads all the encoded pieces. Although large datasets can be concurrently archived by distributing individual object encodings among different nodes, the network and computing capacity of individual nodes constrain the archival process due to such atomicity. We propose a new pipelined coding strategy that distributes the network and computing load of single-object encodings among different nodes, which also speeds up multiple-object archival. We further present RapidRAID codes, an explicit family of pipelined erasure codes which provides fast archival without compromising either data reliability or storage overheads. Finally, we provide a real implementation of RapidRAID codes and benchmark its performance using both a cluster of 50 nodes and a set of Amazon EC2 instances. Experiments show that RapidRAID codes reduce a single object's coding time by up to 90%, while when multiple objects are encoded concurrently, the reduction is up to 20%.

Distributed, Parallel, and Cluster Computing
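
To illustrate the contrast the RapidRAID abstract draws between atomic and pipelined encoding, here is a minimal sketch. It is a toy, not the RapidRAID construction: it assumes a single XOR parity block as the "code", and the function names (`atomic_parity`, `pipelined_parity`) are hypothetical. It only shows that a chain of nodes, each holding one block and forwarding a partial result, can produce the same redundancy as one node that holds the whole object.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def atomic_parity(blocks: list[bytes]) -> bytes:
    """Atomic encoding: one node holds every block and does all the work."""
    return reduce(xor_blocks, blocks)


def pipelined_parity(blocks: list[bytes]) -> bytes:
    """Pipelined encoding (toy): each loop iteration stands for one node.

    The node combines the partial result received from its predecessor
    with its own local block, then forwards the new partial result.
    """
    partial = bytes(len(blocks[0]))  # all-zero block entering the chain
    for local_block in blocks:
        partial = xor_blocks(partial, local_block)
    return partial


blocks = [b"abcd", b"efgh", b"ijkl"]
assert atomic_parity(blocks) == pipelined_parity(blocks)
```

In the pipelined version each node touches only its own block and one partial result, so no single node needs the whole object or the bandwidth to upload every encoded piece.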

An Empirical Study of the Repair Performance of Novel Coding Schemes for Networked Distributed Storage Systems

64 - Lluis Pamies-Juarez , Frederique Oggier , Anwitaman Datta 2012

Erasure coding techniques are getting integrated in networked distributed storage systems as a way to provide fault-tolerance at the cost of less storage overhead than traditional replication. Redundancy is maintained over time through repair mechanisms, which may entail large network resource overheads. In recent years, several novel codes tailor-made for distributed storage have been proposed to optimize storage overhead and repair, such as Regenerating Codes that minimize the per-repair traffic, or Self-Repairing Codes which minimize the number of nodes contacted per repair. Existing studies of these coding techniques are however predominantly theoretical, under the simplifying assumption that only one object is stored. They ignore many practical issues that real systems must address, such as data placement, de/correlation of multiple stored objects, or the competition for limited network resources when multiple objects are repaired simultaneously. This paper empirically studies the repair performance of these novel storage-centric codes with respect to classical erasure codes by simulating realistic scenarios and exploring the interplay of code parameters, failure characteristics and data placement with respect to the trade-offs of bandwidth usage and speed of repairs.

Distributed, Parallel, and Cluster Computing
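
The storage-versus-repair trade-off mentioned in the abstract above can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions, not a model from the paper: it assumes a classical (n, k) MDS erasure code in which an object is split into k pieces and expanded to n, and a naive repair that downloads k surviving pieces to rebuild one lost piece. The function names and the example sizes are hypothetical.

```python
def replication_cost(object_size: float, replicas: int) -> dict[str, float]:
    """Total storage and traffic to restore one lost replica."""
    return {
        "stored": object_size * replicas,
        "repair_traffic": object_size,  # copy one full replica
    }


def mds_cost(object_size: float, n: int, k: int) -> dict[str, float]:
    """Total storage and naive traffic to restore one lost piece
    of an (n, k) MDS code (assumed model: download k pieces)."""
    piece = object_size / k
    return {
        "stored": n * piece,
        "repair_traffic": k * piece,  # equals the whole object size
    }


# Hypothetical example: a 120 MB object.
print(replication_cost(120, replicas=3))  # stores 360, moves 120 per repair
print(mds_cost(120, n=9, k=6))            # stores 180.0, moves 120.0 per repair
```

Under these assumptions the erasure code halves the stored volume, yet rebuilding a single 20 MB piece moves as much data as copying an entire replica. That gap is the repair overhead that Regenerating Codes and Self-Repairing Codes aim to reduce.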
