Mitigating the Performance-Efficiency Tradeoff in Resilient Memory Disaggregation


Abstract in English

We present the design and implementation of a low-latency, low-overhead, and highly available resilient disaggregated cluster memory. Our proposed framework can access erasure-coded remote memory within a single-digit {mu}s read/write latency, significantly improving the performance-efficiency tradeoff over the state-of-the-art - it performs similar to in-memory replication with 1.6x lower memory overhead. We also propose a novel coding group placement algorithm for erasure-coded data, that provides load balancing while reducing the probability of data loss under correlated failures by an order of magnitude.

Download