Repair Strategies for Storage on Mobile Clouds


Abstract in English

We study the data reliability problem for a community of devices forming a mobile cloud storage system. We consider the application of regenerating codes for file maintenance within a geographically limited area. Such codes require less bandwidth to regenerate lost data fragments than file replication or reconstruction. We investigate threshold-based repair strategies, in which data repair is initiated once a threshold number of data fragments has been lost due to node mobility. We show that in a low departure-to-repair rate regime, a lazy repair strategy, in which repairs are initiated after several nodes have left the system, outperforms eager repair, in which repairs are initiated after a single departure. This ordering is reversed when nodes are highly mobile. We further compare distributed and centralized repair strategies and derive the optimal repair threshold for minimizing the average repair cost per unit of time as a function of the underlying code parameters. In addition, we examine cooperative repair strategies and show performance improvements compared to non-cooperative codes. We investigate several models for the time needed for node repair, including a simple fixed-time model that allows for the computation of closed-form expressions and a more realistic model that takes into account the number of repaired nodes. We derive the conditions under which the former model approximates the latter. Finally, we investigate an extended model in which additional failures are allowed during the repair process. Overall, our results establish the joint effect of code design and repair algorithms on the maintenance cost of distributed storage systems.
