Because of its theoretical and practical value, the problem of partitioning networks into community structures has attracted significant research attention across scientific and engineering disciplines. In the literature, Newman's modularity measure is routinely applied to quantify the quality of a given partition, so maximizing the measure provides a principled way of detecting communities in networks. Unfortunately, exact optimization of the measure is NP-hard and is feasible only for very small networks. Approximation approaches are needed to scale to large networks. To address this computational issue, we propose a new method for identifying the partition decisions. Coupled with an iterative rounding strategy and a fast constrained power method, our work achieves tight and effective spectral relaxations. The proposed method was evaluated thoroughly on both real and synthetic networks. Compared with state-of-the-art approaches, it obtained partitions of comparable, if not better, quality. It is also highly suitable for parallel execution and showed a nearly linear speedup as the number of computing nodes increased, which makes it a practical tool for partitioning very large networks.
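As a concrete reference for the quantity being maximized, the sketch below computes Newman's modularity in its per-community form, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of c. It assumes an undirected, unweighted graph without self-loops; the function name `modularity` and the toy graph are illustrative and are not taken from the abstract. It only scores a given partition and does not implement the proposed spectral method.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman's modularity Q of a partition (undirected, unweighted, no self-loops)."""
    m = len(edges)
    intra = defaultdict(int)   # number of edges with both endpoints in the community
    degree = defaultdict(int)  # sum of vertex degrees in the community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
single = {v: "a" for v in range(6)}

q_split = modularity(edges, split)    # one community per triangle
q_single = modularity(edges, single)  # everything in one community
```

On this toy graph the two-triangle partition scores higher than the trivial single-community partition, which is the behaviour a modularity maximizer exploits.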
A distinguishing property of communities in networks is that cycles are more prevalent within communities than across communities. Thus, the detection of these communities may be aided through the incorporation of measures of the local richness of the cyclic structure. In this paper, we introduce renewal non-backtracking random walks (RNBRW) as a way of quantifying this structure. RNBRW gives a weight to each edge equal to the probability that a non-backtracking random walk completes a cycle with that edge. Hence, edges with larger weights may be thought of as more important to the formation of cycles. Of note, since separate random walks can be performed in parallel, RNBRW weights can be estimated very quickly, even for large graphs. We give simulation results showing that pre-weighting edges through RNBRW may substantially improve the performance of common community detection algorithms. Our results suggest that RNBRW is especially efficient for the challenging case of detecting communities in sparse graphs.
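The edge-weighting idea can be sketched as a small Monte Carlo estimate. The version below is a minimal illustration under stated assumptions, not the authors' implementation: each walk starts on a uniformly random edge in a random direction, never immediately reverses, and stops either when it steps onto an already visited vertex (the edge that closes the cycle is credited) or when it reaches a dead end (no edge is credited). The function name `rnbrw_weights`, the parameter `n_walks`, and the normalization by the total number of credited walks are all assumptions made for this example.

```python
import random
from collections import defaultdict

def rnbrw_weights(edges, n_walks=2000, seed=0):
    """Estimate, per edge, the share of non-backtracking walks whose cycle it closes."""
    rng = random.Random(seed)
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    counts = {frozenset(e): 0 for e in edges}
    for _ in range(n_walks):
        u, v = rng.choice(edges)
        if rng.random() < 0.5:       # pick a random direction along the start edge
            u, v = v, u
        visited = {u, v}
        prev, cur = u, v
        while True:
            options = [w for w in adj[cur] if w != prev]  # forbid backtracking
            if not options:          # dead end: the walk renews, no edge is credited
                break
            nxt = rng.choice(options)
            if nxt in visited:       # edge (cur, nxt) completes a cycle
                counts[frozenset((cur, nxt))] += 1
                break
            visited.add(nxt)
            prev, cur = cur, nxt
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()} if total else counts

# Toy graph: two triangles joined by a bridge; the bridge lies on no cycle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
weights = rnbrw_weights(edges)
bridge_weight = weights[frozenset((2, 3))]
```

Because the walks are independent, the outer loop could be split across workers and the counts summed afterwards, which is the parallelism the abstract refers to.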
We present a network community-detection technique based on properties that emerge from a nature-inspired system of aligning particles. Initially, each vertex is assigned a unit vector with a random direction. A nonlinear dynamic law is established so that neighboring vertices try to become aligned with each other. After some time, the system stops and the edges that connect the least-aligned pairs of vertices are removed. The evolution then starts over without the removed edges, and after a sufficient number of removal rounds, each community becomes a connected component. The proposed approach is evaluated using widely accepted benchmarks and real-world networks. Experimental results reveal that the method is robust and excels on a wide variety of networks. Moreover, for large sparse networks, the edge-removal process runs in quasilinear time, which enables application to large-scale networks.
We study the structure of loops in networks using the notion of modulus of loop families. We introduce a new measure of network clustering by quantifying the richness of families of (simple) loops. Modulus tries to minimize the expected overlap among loops by spreading the expected link-usage optimally. We propose weighting networks using these expected link-usages to improve classical community detection algorithms. We show that the proposed method enhances the performance of certain algorithms, such as spectral partitioning and modularity maximization heuristics, on standard benchmarks.
We apply spectral clustering and multislice modularity optimization to a Los Angeles Police Department field interview card data set. To detect communities (i.e., cohesive groups of vertices), we use both geographic and social information about stops involving street gang members in the LAPD district of Hollenbeck. We then compare the algorithmically detected communities with known gang identifications and argue that discrepancies are due to sparsity of social connections in the data as well as complex underlying sociological factors that blur distinctions between communities.
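For readers unfamiliar with spectral clustering, the following sketch shows its simplest two-way form: split the vertices by the sign of an approximate Fiedler vector (the eigenvector of the graph Laplacian L = D - A with the second-smallest eigenvalue). It is a generic illustration on a toy graph, assuming a connected, unweighted graph with vertices numbered 0..n-1; it does not reproduce the geographic and social similarity construction or the multislice modularity step used on the Hollenbeck data. The name `fiedler_bisection` and the shifted power iteration are choices made for this example.

```python
def fiedler_bisection(n, edges, iterations=500):
    """Split vertices 0..n-1 in two by the sign of an approximate Fiedler vector."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # 2 * (max degree) is an upper bound on the largest Laplacian eigenvalue,
    # so shift * I - L is positive semidefinite and power iteration applies.
    shift = 2 * max(len(nb) for nb in adj)
    x = [float(i) for i in range(n)]  # deterministic, non-constant start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]   # project out the constant eigenvector
        # y = (shift * I - L) x = (shift - deg_i) * x_i + sum of neighbouring x_j
        y = [(shift - len(adj[i])) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    side = {i for i in range(n) if x[i] >= 0}
    return side, set(range(n)) - side

# Toy graph: two triangles joined by a bridge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part_a, part_b = fiedler_bisection(6, edges)
```

In practice one would use a sparse eigensolver and cluster several eigenvectors at once; the plain power iteration here only keeps the example dependency-free.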
Community structures are critical to understanding not only the network topology but also how the network functions. However, evaluating the quality of detected community structures remains an open challenge. The most widely used metric, normalized mutual information (NMI), has been shown to suffer from a finite-size effect, and its improved form, relative normalized mutual information (rNMI), suffers from a reverse finite-size effect. Corrected normalized mutual information (cNMI) was therefore proposed and has neither effect. However, in this paper we show that cNMI violates the so-called proportionality assumption. In addition, NMI-type metrics ignore the importance of small communities, and they cannot be used to evaluate a single community of interest. In this paper, we map the computed community labels to the ground-truth ones through integer linear programming, then use the kappa index and F-score to evaluate the detected community structures. Experimental results demonstrate the advantages of our method.
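The evaluation pipeline described above (map labels, then score) can be illustrated on a toy example. The sketch below swaps the integer linear program for an exhaustive search over one-to-one label assignments, which is only practical for a handful of communities, and it assumes there are no more predicted labels than ground-truth labels. The kappa index is computed as Cohen's kappa, (p_o - p_e) / (1 - p_e), and the F-score is computed per community; whether these match the paper's exact definitions is an assumption. All function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def best_label_map(truth, pred):
    """Brute-force stand-in for the ILP: one-to-one map maximizing label agreement."""
    t_labels = sorted(set(truth))
    p_labels = sorted(set(pred))
    overlap = Counter(zip(pred, truth))  # co-occurrence counts (pred label, true label)
    best, best_score = None, -1
    for perm in permutations(t_labels, len(p_labels)):
        score = sum(overlap[(p, t)] for p, t in zip(p_labels, perm))
        if score > best_score:
            best, best_score = dict(zip(p_labels, perm)), score
    return best

def cohen_kappa(truth, mapped):
    """Cohen's kappa; undefined when expected agreement equals 1."""
    n = len(truth)
    p_o = sum(t == m for t, m in zip(truth, mapped)) / n  # observed agreement
    ct, cm = Counter(truth), Counter(mapped)
    p_e = sum(ct[c] * cm[c] for c in ct) / n ** 2         # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

def f_score(truth, mapped, label):
    """F1 score for a single community of interest."""
    tp = sum(t == label and m == label for t, m in zip(truth, mapped))
    fp = sum(t != label and m == label for t, m in zip(truth, mapped))
    fn = sum(t == label and m != label for t, m in zip(truth, mapped))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

truth = [0, 0, 0, 1, 1, 1]
pred = ["x", "x", "y", "y", "y", "y"]  # one vertex of community 0 is misplaced
mapping = best_label_map(truth, pred)
mapped = [mapping[p] for p in pred]
kappa = cohen_kappa(truth, mapped)
f0 = f_score(truth, mapped, 0)
```

Scoring a single label with `f_score` is what allows one community of interest to be evaluated on its own, which NMI-type metrics do not offer.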