Community structure is a typical property of many real-world networks and has become key to understanding the dynamics of networked systems. In these networks most nodes apparently lie in a single community, while a few nodes often straddle several communities. An ideal community detection algorithm should therefore be able to identify the overlapping communities in such networks. To represent an overlapping division, we develop an encoding scheme composed of two segments: the first represents a disjoint partition, and the second represents an extension of the partition that allows multiple memberships. We give a measure of the informativeness of a node and present an evolutionary method for detecting the overlapping communities in a network.
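As a rough illustration of the two-segment encoding described above, the sketch below decodes one individual into an overlapping cover. The abstract does not specify the concrete representation, so the list-of-labels first segment, the list of (node, label) pairs for the second segment, and the function name `decode` are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def decode(partition: List[int], extra: List[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Decode a hypothetical two-segment individual into an overlapping cover.

    partition: segment 1, where partition[i] is the single community label of node i.
    extra:     segment 2, a list of (node, label) pairs granting additional memberships.
    """
    communities: Dict[int, Set[int]] = {}
    # Segment 1: every node belongs to exactly one community (a disjoint partition).
    for node, label in enumerate(partition):
        communities.setdefault(label, set()).add(node)
    # Segment 2: extend the partition so that some nodes gain extra memberships.
    for node, label in extra:
        if label in communities:  # ignore labels that segment 1 never uses
            communities[label].add(node)
    return communities


# Six nodes in two disjoint communities; node 2 additionally joins community 1.
cover = decode([0, 0, 0, 1, 1, 1], [(2, 1)])
```

Keeping the disjoint partition as a separate segment means an evolutionary operator could, in principle, mutate the overlaps without disturbing the base partition; whether the authors' method works this way is not stated in the abstract.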
The conventional notion of community, which favors a high ratio of internal edges to outbound edges, becomes invalid when each vertex participates in multiple communities. Such behavior is commonplace in social networks. The significant overlaps among communities make most existing community detection algorithms ineffective. The lack of effective and efficient tools has resulted in very few empirical studies on large-scale detection and analysis of overlapping community structure in real social networks. We recently developed a scalable and accurate method with linear complexity, called the Partial Community Merger Algorithm (PCMA), and demonstrated its effectiveness by analyzing two online social networks, Sina Weibo and Friendster, with 79.4 and 65.6 million vertices, respectively. Here, we report in-depth analyses of the 2.9 million communities detected by PCMA to uncover their complex overlapping structure. Each community usually overlaps with a significant number of other communities and has far more outbound edges than internal edges. Yet the communities remain well separated from each other. Most vertices in a community are multi-membership vertices, and they can be at the core or at the periphery. Almost half of the entire network can be accounted for by an extremely dense network of communities, with the communities being the vertices and the overlaps being the edges. The empirical findings call for rethinking the notion of community, especially the boundary of a community. Realizing that what matters is how the edges are organized, we suggest the f-core as a suitable concept for overlapping communities in social networks. The results shed new light on the understanding of overlapping communities.
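The internal-versus-outbound comparison that the conventional notion of community relies on can be sketched as follows. This is only a minimal illustration on a toy edge list; the function name `edge_counts` and the example graph are assumptions, and the code is unrelated to PCMA itself.

```python
from typing import Iterable, Set, Tuple


def edge_counts(edges: Iterable[Tuple[int, int]], community: Set[int]) -> Tuple[int, int]:
    """Count internal edges (both endpoints inside) and outbound edges (exactly one inside)."""
    internal = 0
    outbound = 0
    for u, v in edges:
        inside = (u in community) + (v in community)
        if inside == 2:
            internal += 1
        elif inside == 1:
            outbound += 1
    return internal, outbound


# Toy undirected graph: a triangle {0, 1, 2} whose members also link to other vertices.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (2, 4), (3, 4), (0, 5), (1, 6)]
internal, outbound = edge_counts(edges, {0, 1, 2})
```

In this toy case the triangle has 3 internal edges and 4 outbound edges, so a criterion based on a high internal-to-outbound ratio would reject it even though it is a fully connected group.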
The degree distribution of nodes, especially a power-law degree distribution, has been regarded as one of the most significant structural characteristics of social and information networks. Node degree, however, only discloses the first-order structure of a network. Higher-order structures such as edge embeddedness and the size of communities may play more important roles in many online social networks. In this paper, we provide empirical evidence of rich higher-order structural characteristics in online social networks, develop mathematical models to interpret and model these characteristics, and discuss their various applications in practice. In particular, 1) we show that the embeddedness distribution of social links in many social networks has interesting and rich behavior that cannot be captured by well-known network models. We also provide empirical results showing a clear correlation between the embeddedness distribution and the average number of messages communicated between pairs of social network nodes. 2) We formally prove that the random k-tree, a recent model for complex networks, has a power-law embeddedness distribution, and show empirically that the random k-tree model can be used to capture the rich behavior of the higher-order structures we observed in real-world social networks. 3) Going beyond embeddedness, we show that a variant of the random k-tree model can be used to capture the power-law distribution of the size of communities of overlapping cliques discovered recently.
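To make the embeddedness distribution concrete, the sketch below uses the common definition of an edge's embeddedness as the number of neighbors shared by its two endpoints; the abstract does not spell out its definition, so this choice is an assumption, as are the function name and the toy graph.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def embeddedness(edges: Iterable[Tuple[int, int]]) -> Dict[Tuple[int, int], int]:
    """Map each undirected edge to the number of common neighbors of its endpoints."""
    edge_list = list(edges)
    adj: Dict[int, Set[int]] = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    # Common neighbors of u and v are exactly the triangles that the edge (u, v) closes.
    return {(u, v): len(adj[u] & adj[v]) for u, v in edge_list}


# Toy graph: two triangles sharing the edge (1, 2), plus a pendant edge (3, 4).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
emb = embeddedness(edges)
# Embeddedness distribution: how many edges have each embeddedness value.
distribution = Counter(emb.values())
```

Here the shared edge (1, 2) has embeddedness 2 and the pendant edge (3, 4) has embeddedness 0, so two nodes of similar degree can sit on edges with very different embeddedness, which is the kind of higher-order information that degree alone does not capture.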
Community or modular structure is considered a significant property of large-scale real-world graphs such as social or information networks. Detecting influential clusters or communities in these graphs is a problem of considerable interest, as such structure often accounts for the functionality of the system. We aim to provide a thorough exposition of the topic, including the main elements of the problem, a brief introduction to existing research on both disjoint and overlapping community search, the idea of influential communities and its implications, and the current state of the art. Finally, we offer some insight into possible directions for future research.
Identifying communities in complex networks has become an effective means of analyzing complex systems. It has broad applications in diverse areas such as social science, engineering, biology and medicine. Finding communities of nodes and finding communities of links are two popular schemes for network structure analysis. These schemes, however, have inherent drawbacks and are often inadequate for properly capturing complex organizational structures in real networks. We introduce a new scheme and an effective approach for identifying complex network structures using a mixture of node and link communities, called hybrid node-link communities. A central piece of our approach is a probabilistic model that accommodates node, link and hybrid node-link communities. Our extensive experiments on various real-world networks, including a large protein-protein interaction network and a large semantic association network of commonly used words, show that the scheme for hybrid communities is superior in revealing network characteristics. Moreover, the new approach outperforms existing methods for finding node or link communities separately.
In transportation, communication, social and other real complex networks, some critical edges play a pivotal role in controlling the flow of information and maintaining the integrity of the structure. Owing to the importance of critical edges in theoretical studies and practical applications, their identification has gradually become a hot topic in current research. Considering the overlap of communities in the neighborhood of edges, we propose a novel and effective metric named subgraph overlap (SO) to quantify the significance of edges. The experimental results show that SO outperforms all benchmarks in identifying critical edges, which are crucial for maintaining the integrity of the structure and functions of networks.