ﻻ يوجد ملخص باللغة العربية
In this paper, we present, to our knowledge, the first known I/O efficient solutions for computing the k-bisimulation partition of a massive directed graph, and performing maintenance of such a partition upon updates to the underlying graph. Ubiquitous in the theory and application of graph data, bisimulation is a robust notion of node equivalence which intuitively groups together nodes in a graph which share fundamental structural features. k-bisimulation is the standard variant of bisimulation where the topological features of nodes are only considered within a local neighborhood of radius $kgeqslant 0$. The I/O cost of our partition construction algorithm is bounded by $O(kcdot mathit{sort}(|et|) + kcdot scan(| t|) + mathit{sort}(| t|))$, while our maintenance algorithms are bounded by $O(kcdot mathit{sort}(|et|) + kcdot mathit{sort}(| t|))$. The space complexity bounds are $O(| t|+|et|)$ and $O(kcdot| t|+kcdot|et|)$, resp. Here, $|et|$ and $| t|$ are the number of disk pages occupied by the input graphs edge set and node set, resp., and $mathit{sort}(n)$ and $mathit{scan}(n)$ are the cost of sorting and scanning, resp., a file occupying $n$ pages in external memory. Empirical analysis on a variety of massive real-world and synthetic graph datasets shows that our algorithms perform efficiently in practice, scaling gracefully as graphs grow in size.
Given a similarity graph between items, correlation clustering (CC) groups similar items together and dissimilar ones apart. One of the most popular CC algorithms is KwikCluster: an algorithm that serially clusters neighborhoods of vertices, and obta
As users become confronted with a deluge of provenance data, dedicated techniques are required to make sense of this kind of information. We present Aggregation by Provenance Types, a provenance graph analysis that is capable of generating provenance
Frequent-pattern mining is a common approach to reveal the valuable hidden trends behind data. However, existing frequent-pattern mining algorithms are designed for DRAM, instead of persistent memories (PMs), which can lead to severe performance and
Recent studies showed that single-machine graph processing systems can be as highly competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the hig
This paper shows an application of Bloom and Esiks iteration algebras to model graph data in a graph database query language. About twenty years ago, Buneman et al. developed a graph database query language UnQL on the top of a functional meta-langua