Community structures are critical to understanding not only the topology of a network but also how the network functions. However, evaluating the quality of detected community structures remains a challenging, unsolved problem. The most widely used metric, normalized mutual information (NMI), has been shown to suffer from a finite size effect, and its improved form, relative normalized mutual information (rNMI), suffers from a reverse finite size effect. Corrected normalized mutual information (cNMI) was therefore proposed and has neither a finite size effect nor a reverse finite size effect. In this paper, however, we show that cNMI violates the so-called proportionality assumption. In addition, NMI-type metrics ignore the importance of small communities, and they cannot be used to evaluate a single community of interest. We therefore map the computed community labels to the ground-truth labels through integer linear programming, and then use the kappa index and F-score to evaluate the detected community structures. Experimental results demonstrate the advantages of our method.
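The map-then-score idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no details of the integer linear program, so the sketch substitutes an exhaustive search over one-to-one label assignments, which is practical only for a handful of communities. The function names (`best_label_mapping`, `cohen_kappa`, `f_score`) and the toy labels are hypothetical, and the kappa index is assumed here to be Cohen's kappa.

```python
from collections import Counter
from itertools import permutations


def best_label_mapping(truth, detected):
    """Map each detected label to a distinct ground-truth label with maximal overlap.

    Assumption: brute force over permutations stands in for the integer
    linear program mentioned in the abstract; it only scales to a few labels.
    """
    truth_labels = sorted(set(truth))
    detected_labels = sorted(set(detected))
    overlap = Counter(zip(detected, truth))
    # Pad with None so every detected label can receive some target.
    size = max(len(truth_labels), len(detected_labels))
    padded_truth = truth_labels + [None] * (size - len(truth_labels))
    best_score, best_map = -1, {}
    for perm in permutations(padded_truth, len(detected_labels)):
        score = sum(overlap[(d, t)] for d, t in zip(detected_labels, perm))
        if score > best_score:
            best_score, best_map = score, dict(zip(detected_labels, perm))
    return best_map


def cohen_kappa(truth, mapped):
    """Chance-corrected agreement between ground-truth and mapped labels."""
    n = len(truth)
    observed = sum(t == m for t, m in zip(truth, mapped)) / n
    truth_counts, mapped_counts = Counter(truth), Counter(mapped)
    expected = sum(truth_counts[c] * mapped_counts[c] for c in truth_counts) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f_score(truth, mapped, community):
    """F1 score of a single community of interest."""
    tp = sum(t == community and m == community for t, m in zip(truth, mapped))
    if tp == 0:
        return 0.0
    precision = tp / sum(m == community for m in mapped)
    recall = tp / sum(t == community for t in truth)
    return 2 * precision * recall / (precision + recall)


# Toy example: six nodes, one of them placed in the wrong community.
truth = [0, 0, 0, 1, 1, 1]
detected = [5, 5, 5, 7, 7, 5]
mapping = best_label_mapping(truth, detected)
mapped = [mapping[d] for d in detected]
print(mapping, cohen_kappa(truth, mapped), f_score(truth, mapped, 1))
```

Because the F-score is computed per community, a small community or a single community of interest can be assessed on its own, which is the limitation of NMI-type metrics that the abstract points out.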
We apply spectral clustering and multislice modularity optimization to a Los Angeles Police Department field interview card data set. To detect communities (i.e., cohesive groups of vertices), we use both geographic and social information about stops involving street gang members in the LAPD district of Hollenbeck. We then compare the algorithmically detected communities with known gang identifications and argue that discrepancies are due to sparsity of social connections in the data as well as complex underlying sociological factors that blur distinctions between communities.
Community detection is a key task to further understand the function and the structure of complex networks. Therefore, a strategy used to assess this task must be able to avoid biased and incorrect results that might invalidate further analyses or applications that rely on such communities. Two widely used strategies to assess this task are generally known as structural and functional. The structural strategy basically consists in detecting and assessing such communities by using multiple methods and structural metrics. On the other hand, the functional strategy might be used when ground truth data are available to assess the detected communities. However, the evaluation of communities based on such strategies is usually done in experimental configurations that are largely susceptible to biases, a situation that is inherent to algorithms, metrics and network data used in this task. Furthermore, such strategies are not systematically combined in a way that allows for the identification and mitigation of bias in the algorithms, metrics or network data to converge into more consistent results. In this context, the main contribution of this article is an approach that supports a robust quality evaluation when detecting communities in real-world networks. In our approach, we measure the quality of a community by applying the structural and functional strategies, and the combination of both, to obtain different pieces of evidence. Then, we consider the divergences and the consensus among the pieces of evidence to identify and overcome possible sources of bias in community detection algorithms, evaluation metrics, and network data. Experiments conducted with several real and synthetic networks provided results that show the effectiveness of our approach to obtain more consistent conclusions about the quality of the detected communities.
We introduce a new paradigm that is important for community detection in the realm of network analysis. Networks contain a set of strong, dominant communities, which interfere with the detection of weak, natural community structure. When most of the members of the weak communities also belong to stronger communities, the weak communities are extremely hard to uncover. We call the weak communities the hidden community structure. We present a novel approach called HICODE (HIdden COmmunity DEtection) that identifies the hidden community structure as well as the dominant community structure. By weakening the strength of the dominant structure, one can uncover the hidden structure beneath. Likewise, by reducing the strength of the hidden structure, one can more accurately identify the dominant structure. In this way, HICODE tackles both tasks simultaneously. Extensive experiments on real-world networks demonstrate that HICODE outperforms several state-of-the-art community detection methods in uncovering both the dominant and the hidden structure. In the Facebook university social networks, we find multiple non-redundant sets of communities that are strongly associated with residential hall, year of registration, or career position of the faculty or students, while the state-of-the-art algorithms mainly locate the dominant ground truth category. Due to the difficulty of labeling all ground truth communities in real-world datasets, HICODE provides a promising approach to pinpoint the existing latent communities and uncover communities for which there is no ground truth. Finding this unknown structure is an extremely important community detection problem.
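The weakening step can be pictured with a small sketch. The abstract does not say how HICODE reduces the strength of a structure, so the rule used here, deleting every edge whose endpoints share a dominant community, is an assumption chosen for illustration only. The function name `weaken_dominant_structure` and the toy graph are hypothetical.

```python
def weaken_dominant_structure(edges, dominant_label):
    """Drop every edge whose two endpoints share a dominant community.

    Assumption: plain edge removal is one possible weakening rule; the
    abstract does not state which rule HICODE actually uses.
    """
    return [(u, v) for u, v in edges if dominant_label[u] != dominant_label[v]]


# Toy graph: two dominant triangles {0, 1, 2} and {3, 4, 5}, plus a weaker
# layer of cross links that pairs node i with node i + 3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (0, 3), (1, 4), (2, 5)]
dominant_label = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

residual_edges = weaken_dominant_structure(edges, dominant_label)
print(residual_edges)
```

Running any community detection method on `residual_edges` would then see only the weaker layer, which is the intuition behind uncovering hidden structure once the dominant one has been suppressed.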
Community detection, which aims to group nodes based on their connections, plays an important role in network analysis: communities, treated as meta-nodes, allow us to create a large-scale map of a network and thereby simplify its analysis. However, for privacy reasons, we may want to prevent communities from being discovered in certain cases, which leads to the topic of community deception. In this paper, we formalize this community detection attack problem at three scales: global attack (macroscale), target community attack (mesoscale), and target node attack (microscale). We treat the attack as an optimization problem and propose a novel Evolutionary Perturbation Attack (EPA) method, in which we generate adversarial networks to realize the community detection attack. Numerical experiments validate that EPA can successfully attack community detection algorithms at all three scales, i.e., hide target nodes or communities and further disturb the community structure of the whole network, by changing only a small fraction of links. EPA performs better than a number of baseline attack methods on six synthetic networks and three real-world networks. More interestingly, although EPA is based on the Louvain algorithm, it is also effective in attacking other community detection algorithms, which validates its good transferability.
Heterogeneous networks are networks consisting of different types of nodes and multiple types of edges linking such nodes. While community detection has been extensively developed as a useful technique for analyzing networks that contain only one type of node, very few community detection techniques have been developed for heterogeneous networks. In this paper, we propose a modularity-based community detection framework for heterogeneous networks. Unlike existing methods, the proposed approach has the flexibility to treat the number of communities as an unknown quantity. We describe a Louvain-type maximization method for finding the community structure that maximizes the modularity function. Our simulation results show the advantages of the proposed method over existing methods. Moreover, the proposed modularity function is shown to be consistent under a heterogeneous stochastic blockmodel framework. Analyses of the DBLP four-area dataset and a MovieLens dataset demonstrate the usefulness of the proposed method.
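For orientation, the quantity that a Louvain-type method maximizes can be sketched for the ordinary single-type case. The abstract does not define the heterogeneous modularity function, so the code below computes the standard Newman-Girvan modularity of an undirected, unweighted graph as a simplified stand-in. The function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity: sum over communities of e_c/m - (d_c/(2m))^2.

    Assumption: this is the homogeneous, undirected, unweighted definition,
    not the heterogeneous modularity proposed in the paper.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

A Louvain-type method greedily moves nodes between communities while such a score increases, so the number of communities falls out of the optimization rather than being fixed in advance.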