DFOGraph: An I/O- and Communication-Efficient System for Distributed Fully-out-of-Core Graph Processing


Abstract in English

With the magnitude of graph-structured data continually increasing, graph processing systems that can scale-out and scale-up are needed to handle extreme-scale datasets. While existing distributed out-of-core solutions have made it possible, they suffer from limited performance due to excessive I/O and communication costs. We present DFOGraph, a distributed fully-out-of-core graph processing system that applies and assembles multiple techniques to enable I/O- and communication-efficient processing. DFOGraph builds upon two-level column-oriented partition with adaptive compressed representations to allow fine-grained selective computation and communication, and it only issues necessary disk and network requests. Our evaluation shows DFOGraph achieves performance comparable to GridGraph and FlashGraph (>2.52x and 1.06x) on a single machine and outperforms Chaos and HybridGraph significantly (>12.94x and >10.82x) when scaling out.

Download