Hidden community is a recently proposed graph-theoretical concept [4]; the authors who introduced it also proposed a meta-approach called HICODE (Hidden Community Detection) for detecting hidden communities. Experiments show that HICODE can uncover previously overshadowed weak layers and detect both weak and strong layers with higher accuracy. However, the authors provide no theoretical guarantee for its performance. In this work, we focus on the theoretical analysis of HICODE on synthetic two-layer networks, where the layers are independent of each other and each layer is generated by a stochastic block model. We bridge this gap for two-layer stochastic block model networks in the following aspects: 1) we show that partitions that locally optimize modularity correspond to grounded layers, indicating that modularity-optimizing algorithms can detect strong layers; 2) we prove that when reducing found layers, HICODE increases the absolute modularities of all unreduced layers, showing that its layer reduction step makes weak layers more detectable. Our work builds a solid theoretical base for HICODE, demonstrating that it is promising in uncovering both weak and strong layers of communities in two-layer networks.
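The second claim above can be illustrated numerically with a minimal sketch. It assumes a simple reduction rule (deleting every edge whose endpoints share a community in the found layer), which is only one possible reading of "reducing a layer" and is not taken from the abstract. The function names `two_layer_sbm`, `modularity`, and `remove_layer`, and all parameter values, are hypothetical choices made for this illustration.

```python
import random
from itertools import combinations


def two_layer_sbm(n, k1, k2, p1, p2, seed=0):
    """Sample a two-layer network: each layer is a planted partition
    (intra-community edge probability p1 or p2, no inter-community edges),
    and the two partitions are assigned independently of each other."""
    rng = random.Random(seed)
    layer1 = [i % k1 for i in range(n)]
    # Layer 2 uses a random permutation so it is independent of layer 1.
    perm = list(range(n))
    rng.shuffle(perm)
    layer2 = [0] * n
    for rank, node in enumerate(perm):
        layer2[node] = rank % k2
    edges = set()
    for u, v in combinations(range(n), 2):
        if layer1[u] == layer1[v] and rng.random() < p1:
            edges.add((u, v))
        if layer2[u] == layer2[v] and rng.random() < p2:
            edges.add((u, v))
    return edges, layer1, layer2


def modularity(edges, labels):
    """Newman modularity of a partition on an undirected, unweighted graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    intra, degree = {}, {}
    for u, v in edges:
        degree[labels[u]] = degree.get(labels[u], 0) + 1
        degree[labels[v]] = degree.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def remove_layer(edges, labels):
    """Assumed reduction rule: drop every edge inside a community of `labels`."""
    return {(u, v) for u, v in edges if labels[u] != labels[v]}


# Illustrative parameters: layer 1 is the strong layer, layer 2 the weak one.
edges, strong, weak = two_layer_sbm(n=200, k1=4, k2=5, p1=0.3, p2=0.15)
q_strong = modularity(edges, strong)
q_weak_before = modularity(edges, weak)
q_weak_after = modularity(remove_layer(edges, strong), weak)
print(f"strong layer modularity:            {q_strong:.3f}")
print(f"weak layer modularity before/after: "
      f"{q_weak_before:.3f} / {q_weak_after:.3f}")
```

With these parameters the weak layer's modularity rises sharply once the strong layer's intra-community edges are removed. The sketch only checks the direction of the effect on one sampled network; it is not a substitute for the proof.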
We introduce a new paradigm that is important for community detection in the realm of network analysis. Networks contain a set of strong, dominant communities, which interfere with the detection of weak, natural community structure. When most of the members of the weak communities also belong to stronger communities, the weak communities are extremely hard to uncover. We call the weak communities the hidden community structure. We present a novel approach called HICODE (HIdden COmmunity DEtection) that identifies the hidden community structure as well as the dominant community structure. By weakening the strength of the dominant structure, one can uncover the hidden structure beneath. Likewise, by reducing the strength of the hidden structure, one can more accurately identify the dominant structure. In this way, HICODE tackles both tasks simultaneously. Extensive experiments on real-world networks demonstrate that HICODE outperforms several state-of-the-art community detection methods in uncovering both the dominant and the hidden structure. In the Facebook university social networks, we find multiple non-redundant sets of communities that are strongly associated with residential hall, year of registration, or career position of faculty or students, while the state-of-the-art algorithms mainly locate the dominant ground truth category. Due to the difficulty of labeling all ground truth communities in real-world datasets, HICODE provides a promising approach to pinpoint the existing latent communities and uncover communities for which there is no ground truth. Finding this unknown structure is an extremely important community detection problem.
Community detection, which aims to group nodes based on their connections, plays an important role in network analysis, since communities, treated as meta-nodes, allow us to create a large-scale map of a network to simplify its analysis. However, for privacy reasons, we may want to prevent communities from being discovered in certain cases, leading to the topic of community deception. In this paper, we formalize this community detection attack problem at three scales: global attack (macroscale), target community attack (mesoscale), and target node attack (microscale). We treat this as an optimization problem and further propose a novel Evolutionary Perturbation Attack (EPA) method, where we generate adversarial networks to realize the community detection attack. Numerical experiments validate that our EPA can successfully attack network community algorithms at all three scales, i.e., hide target nodes or communities and further disturb the community structure of the whole network by changing only a small fraction of links. By comparison, our EPA performs better than a number of baseline attack methods on six synthetic networks and three real-world networks. More interestingly, although our EPA is based on the Louvain algorithm, it is also effective in attacking other community detection algorithms, validating its good transferability.
We apply spectral clustering and multislice modularity optimization to a Los Angeles Police Department field interview card data set. To detect communities (i.e., cohesive groups of vertices), we use both geographic and social information about stops involving street gang members in the LAPD district of Hollenbeck. We then compare the algorithmically detected communities with known gang identifications and argue that discrepancies are due to sparsity of social connections in the data as well as complex underlying sociological factors that blur distinctions between communities.
Community structures are critical to understanding not only the network topology but also how the network functions. However, evaluating the quality of detected community structures is still challenging and remains unsolved. The most widely used metric, normalized mutual information (NMI), has been shown to have a finite size effect, and its improved form, relative normalized mutual information (rNMI), has a reverse finite size effect. Corrected normalized mutual information (cNMI) was thus proposed and has neither a finite size effect nor a reverse finite size effect. However, in this paper we show that cNMI violates the so-called proportionality assumption. In addition, NMI-type metrics have the problem of ignoring the importance of small communities. Finally, they cannot be used to evaluate a single community of interest. In this paper, we map the computed community labels to the ground-truth ones through integer linear programming, then use the kappa index and F-score to evaluate the detected community structures. Experimental results demonstrate the advantages of our method.
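The map-then-score idea can be sketched on a toy example. Note the substitution: the abstract uses integer linear programming for the label mapping, whereas this sketch uses an exhaustive search over one-to-one mappings, which is only practical for a handful of communities. The function names `best_mapping`, `cohen_kappa`, and `f_score`, the use of Cohen's kappa as the "kappa index", and the toy labels are all assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations


def best_mapping(pred, truth):
    """Find the one-to-one mapping from computed labels to ground-truth labels
    that maximizes the number of agreeing nodes (brute force, small inputs only).
    Assumes there are at most as many computed labels as ground-truth labels."""
    pred_labels = sorted(set(pred))
    true_labels = sorted(set(truth))
    overlap = Counter(zip(pred, truth))
    best, best_score = None, -1
    for perm in permutations(true_labels, len(pred_labels)):
        score = sum(overlap[(p, t)] for p, t in zip(pred_labels, perm))
        if score > best_score:
            best, best_score = dict(zip(pred_labels, perm)), score
    return best


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both labelings put every node in the same single class
    return (observed - expected) / (1 - expected)


def f_score(mapped, truth, label):
    """F1 score of a single community of interest after label mapping."""
    tp = sum(m == label and t == label for m, t in zip(mapped, truth))
    fp = sum(m == label and t != label for m, t in zip(mapped, truth))
    fn = sum(m != label and t == label for m, t in zip(mapped, truth))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Toy data: nine nodes, three ground-truth communities, one misassigned node.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = ["b", "b", "b", "c", "c", "a", "a", "a", "a"]

mapping = best_mapping(pred, truth)
mapped = [mapping[p] for p in pred]
kappa = cohen_kappa(mapped, truth)
per_community_f = {label: f_score(mapped, truth, label) for label in set(truth)}
print(mapping, round(kappa, 3), per_community_f)
```

On this toy input the mapping is a to 2, b to 0, c to 1, and kappa is 5/6. The per-community F-scores show how a single community of interest can be evaluated on its own, which NMI-type metrics do not allow.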
We introduce a new conception of community structure, which we refer to as hidden community structure. Hidden community structure refers to a specific type of overlapping community structure, in which the detection of weak, but meaningful, communities is hindered by the presence of stronger communities. We present Hidden Community Detection (HICODE), an algorithm template that identifies both the strong, dominant community structure and the weaker, hidden community structure in networks. HICODE begins by applying an existing community detection algorithm to a network, and then removes the structure of the detected communities from the network. In this way, the structure of the weaker communities becomes visible. Through application of HICODE, we demonstrate that a wide variety of real networks from different domains contain many communities that, though meaningful, are not detected by any of the popular community detection algorithms that we consider. Additionally, on both real and synthetic networks containing a hidden ground-truth community structure, HICODE uncovers this structure better than any of the baseline algorithms that we compared against. For example, on a real network of undergraduate students that can be partitioned either by "Dorm" (residence hall) or "Year", we see that HICODE uncovers the weaker "Year" communities with a JCRecall score (a recall-based metric that we define in the text) of over 0.7, while the baseline algorithms achieve scores below 0.2.