We study the problem of approximating the $3$-profile of a large graph. The $3$-profile generalizes the triangle count: it records how many times each graph on three vertices appears as an induced subgraph of the large graph. Our algorithm uses the novel concept of $3$-profile sparsifiers: sparse graphs that can be used to approximate the full $3$-profile counts of a given large graph. Further, we study the problem of estimating local and ego $3$-profiles, two graph quantities that characterize the local neighborhood of each vertex of a graph. Our algorithm is distributed and operates as a vertex program over the GraphLab PowerGraph framework. We introduce the concept of edge pivoting, which allows us to collect $2$-hop information without maintaining an explicit $2$-hop neighborhood list at each vertex. This enables the computation of all the local $3$-profiles in parallel with minimal communication. We test our implementation in several experiments scaling up to $640$ cores on Amazon EC2. We find that our algorithm can estimate the $3$-profile of a graph in approximately the same time as triangle counting. For the harder problem of ego $3$-profiles, we introduce an algorithm that can estimate the profiles of hundreds of thousands of vertices in parallel within minutes.
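To make the two central notions concrete, the following is a minimal single-machine sketch in Python. It is not the distributed PowerGraph vertex program described above. The function names (`three_profile`, `sparsify`, `estimate_profile`) are hypothetical, and the estimator is an assumption: it supposes the sparsifier keeps each edge independently with probability $p$, so a vertex triple with $j$ edges retains $i$ of them with probability $\binom{j}{i} p^i (1-p)^{j-i}$, and it inverts that relation by back substitution. The estimator in the paper may be stated differently.

```python
from itertools import combinations
from math import comb
import random


def three_profile(n, edges):
    """Exact global 3-profile by brute force over all vertex triples.

    Returns counts where counts[k] is the number of triples that induce
    exactly k edges (0: empty, 1: single edge, 2: path, 3: triangle).
    """
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for triple in combinations(range(n), 3):
        k = sum(frozenset(pair) in edge_set for pair in combinations(triple, 2))
        counts[k] += 1
    return counts


def sparsify(edges, p, rng):
    """Assumed sparsifier: keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]


def estimate_profile(observed, p):
    """Unbias a profile measured on the sparsified graph.

    Solves A x = observed, where A[i][j] = C(j, i) p^i (1-p)^(j-i) is the
    probability that a triple with j edges keeps exactly i of them.
    A is upper triangular, so back substitution suffices.
    """
    est = [0.0] * 4
    for i in range(3, -1, -1):
        carried = sum(
            comb(j, i) * p**i * (1 - p) ** (j - i) * est[j]
            for j in range(i + 1, 4)
        )
        est[i] = (observed[i] - carried) / p**i
    return est


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(three_profile(4, edges))  # [0, 1, 2, 1]

    # One random sparsified estimate; the result varies with the seed.
    rng = random.Random(0)
    sampled = sparsify(edges, 0.5, rng)
    print(estimate_profile(three_profile(4, sampled), 0.5))
```

The brute-force count takes $O(n^3)$ time and is only meant to pin down the definition. On a graph this small a single sparsified estimate is very noisy; the point of the sketch is that the estimate is correct in expectation under the stated sampling assumption.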
We present a novel distributed algorithm for counting all four-node induced subgraphs in a big graph. These counts, called the $4$-profile, describe a graph's connectivity properties and have found several uses ranging from bioinformatics to spam dete
Graphs and networks are used to model interactions in a variety of contexts. There is a growing need to quickly assess the characteristics of a graph in order to understand its underlying structure. Some of the most useful metrics are triangle-based
This paper is concerned with efficiently coloring sparse graphs in the distributed setting with as few colors as possible. According to the celebrated Four Color Theorem, planar graphs can be colored with at most 4 colors, and the proof gives a (sequ
Real-world complex networks describe connections between objects; in reality, those objects are often endowed with some kind of features. How does the presence or absence of such features interplay with the network link structure? Although the situat
Embedding networks into a fixed-dimensional feature space while preserving their essential structural properties is a fundamental task in graph analytics. These feature vectors (graph descriptors) are used to measure the pairwise similarity between gr