An Empirical Study of the Repair Performance of Novel Coding Schemes for Networked Distributed Storage Systems


الملخص بالإنكليزية

Erasure coding techniques are getting integrated in networked distributed storage systems as a way to provide fault-tolerance at the cost of less storage overhead than traditional replication. Redundancy is maintained over time through repair mechanisms, which may entail large network resource overheads. In recent years, several novel codes tailor-made for distributed storage have been proposed to optimize storage overhead and repair, such as Regenerating Codes that minimize the per repair traffic, or Self-Repairing Codes which minimize the number of nodes contacted per repair. Existing studies of these coding techniques are however predominantly theoretical, under the simplifying assumption that only one object is stored. They ignore many practical issues that real systems must address, such as data placement, de/correlation of multiple stored objects, or the competition for limited network resources when multiple objects are repaired simultaneously. This paper empirically studies the repair performance of these novel storage centric codes with respect to classical erasure codes by simulating realistic scenarios and exploring the interplay of code parameters, failure characteristics and data placement with respect to the trade-offs of bandwidth usage and speed of repairs.

تحميل البحث