A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks

Added by Emilie Kaufmann

Publication date 2015

fields Mathematical Statistics

Research language: English

Authors Emilie Kaufmann

Machine Learning

This paper presents a novel spectral algorithm with additive clustering designed to identify overlapping communities in networks. The algorithm is based on geometric properties of the spectrum of the expected adjacency matrix in a random graph model that we call the stochastic blockmodel with overlap (SBMO). An adaptive version of the algorithm, which does not require knowledge of the number of hidden communities, is proved to be consistent under the SBMO when the degrees in the graph are (slightly more than) logarithmic. The algorithm is shown to perform well on simulated data and on real-world graphs with known overlapping communities.
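The geometric idea behind additive clustering can be illustrated with a small sketch. It assumes, for illustration only, that the expected adjacency matrix factorizes (up to scaling) as Z B Z^T, where Z is a hypothetical 0/1 membership matrix and B a symmetric connectivity matrix; the abstract does not state this form, and the toy values below are invented. Under that assumption, the row of a node belonging to two communities is the sum of the rows of nodes belonging to each community alone.

```python
# Sketch only: the factorized form Z B Z^T is an assumption made for
# illustration; the names Z, B and the toy values are hypothetical.

def expected_adjacency(Z, B):
    """Return Z B Z^T for a 0/1 membership matrix Z (n x K)
    and a symmetric K x K connectivity matrix B."""
    n, K = len(Z), len(B)
    # ZB[i][l] = sum over k of Z[i][k] * B[k][l]
    ZB = [[sum(Z[i][k] * B[k][l] for k in range(K)) for l in range(K)]
          for i in range(n)]
    # A[i][j] = sum over l of ZB[i][l] * Z[j][l]
    return [[sum(ZB[i][l] * Z[j][l] for l in range(K)) for j in range(n)]
            for i in range(n)]

# Toy instance: 2 communities, 5 nodes, node 4 belongs to both.
Z = [
    [1, 0],  # node 0: community 0 only
    [1, 0],  # node 1: community 0 only
    [0, 1],  # node 2: community 1 only
    [0, 1],  # node 3: community 1 only
    [1, 1],  # node 4: overlapping node
]
B = [[0.5, 0.125],
     [0.125, 0.25]]

A = expected_adjacency(Z, B)

# Nodes with the same memberships have identical rows ...
same_rows = A[0] == A[1]
# ... and the overlapping node's row is the sum of two "pure" rows.
additive = A[4] == [a + b for a, b in zip(A[0], A[2])]
print(same_rows, additive)
```

Because each row depends linearly on the node's membership vector, recovering memberships amounts to expressing rows as sums of a few "pure" rows, which is one way to read the phrase "additive clustering".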

A Robust Spectral Clustering Algorithm for Sub-Gaussian Mixture Models with Outliers

Prateek R. Srivastava, Purnamrita Sarkar, Grani A. Hanasusanto (2019)

We consider the problem of clustering datasets in the presence of arbitrary outliers. Traditional clustering algorithms such as k-means and spectral clustering are known to perform poorly on datasets contaminated with even a small number of outliers. In this paper, we develop a provably robust spectral clustering algorithm that applies a simple rounding scheme to denoise a Gaussian kernel matrix built from the data points and uses vanilla spectral clustering to recover the cluster labels of the data points. We analyze the performance of our algorithm under the assumption that the good data points (which we term inliers) are generated from a mixture of sub-Gaussians, while the outlier points can come from any arbitrary probability distribution. For this general class of models, we show that the misclassification error decays at an exponential rate in the signal-to-noise ratio, provided the number of outliers is a small fraction of the inlier points. Surprisingly, this derived error bound matches the best-known bound for semidefinite programs (SDPs) under the same setting without outliers. We conduct extensive experiments on a variety of simulated and real-world datasets to demonstrate that our algorithm is less sensitive to outliers than other state-of-the-art algorithms proposed in the literature.
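As a rough illustration of the denoising step, the sketch below builds a Gaussian kernel matrix and rounds its entries to 0 or 1 with a fixed threshold. The thresholding rule, the bandwidth `sigma`, the threshold value, and the toy points are all assumptions chosen for illustration; the abstract does not specify the actual rounding scheme, and the subsequent spectral clustering step is omitted here.

```python
import math

def gaussian_kernel(points, sigma):
    """Kernel matrix with K[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2
                           for a, b in zip(points[i], points[j]))
                      / (2 * sigma ** 2))
             for j in range(n)]
            for i in range(n)]

def round_kernel(K, threshold):
    """Hypothetical rounding: keep a similarity only if it is large."""
    return [[1 if v >= threshold else 0 for v in row] for row in K]

# Two tight clusters plus one far-away point playing the outlier.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (20.0, -20.0)]

K = gaussian_kernel(points, sigma=1.0)
R = round_kernel(K, threshold=0.5)

# Number of neighbours of each point after rounding (self excluded).
degrees = [sum(row) - 1 for row in R]
print(R)
print(degrees)
```

In this toy case the rounded matrix is block diagonal and the outlier ends up isolated, which suggests why a standard spectral method applied afterwards is less affected by it.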

Machine Learning, Statistics Theory

Finding overlapping communities in networks using evolutionary method

Zhan Weihua, Chen Huahui, Guan Jihong (2013)

Community structure is a typical property of many real-world networks and has become key to understanding the dynamics of networked systems. In these networks, most nodes apparently lie in a single community, while there often exist a few nodes straddling several communities. An ideal community detection algorithm should therefore identify the overlapping communities in such networks. To represent an overlapping division, we develop an encoding schema composed of two segments: the first represents a disjoint partition, and the second represents an extension of the partition that allows multiple memberships. We give a measure for the informativeness of a node and present an evolutionary method for detecting the overlapping communities in a network.

Social and Information Networks, Physics and Society

A Game-Theoretic Approach for Detection of Overlapping Communities in Dynamic Complex Networks

Elham Havvaei, Narsingh Deo (2016)

Complex networks tend to display communities, which are groups of nodes cohesively connected among themselves and sparsely connected to the remainder of the network. Detecting such communities is an important computational problem, since it provides insight into the functionality of networks. Investigating community structure in a dynamic network, where the network is subject to change, is even more challenging. This paper presents a game-theoretic technique for detecting community structures in dynamic as well as static complex networks. In our method, each node takes the role of a player that attempts to gain a higher payoff by joining one or more communities or switching between them. The goal of the game is to reveal the community structure formed by these players by finding a Nash equilibrium among them. To the best of our knowledge, this is the first game-theoretic algorithm able to extract overlapping communities from either static or dynamic networks. We present experimental results illustrating the effectiveness of the proposed method on both synthetic and real-world networks.

Computer Science and Game Theory, Computational Complexity, Data Structures and Algorithms

Group testing for overlapping communities

Pavlos Nikolopoulos, Sundara Rajan Srinivasavaradhan, Tao Guo (2020)

In this paper, we propose algorithms that leverage a known community structure to make group testing more efficient. We consider a population organized in connected communities: each individual participates in one or more communities, and the infection probability of each individual depends on the communities (s)he participates in. Use cases include students who participate in several classes and workers who share common spaces. Group testing reduces the number of tests needed to identify the infected individuals by pooling diagnostic samples and testing them together. We show that making testing algorithms aware of the community structure can significantly reduce the number of tests needed for both adaptive and non-adaptive group testing.
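To see why pooling saves tests at all, consider the classical two-stage scheme: test a pool of k samples once, and retest each member individually only if the pool is positive. The sketch below computes the expected number of tests per person under the assumption that infections are independent with a common probability p. This is the textbook baseline, not the community-aware algorithms proposed in the paper, and the function names are hypothetical.

```python
def expected_tests_per_person(p, k):
    """Expected tests per individual for two-stage pooling.

    Assumes independent infections with probability p and pools of size k.
    One pooled test is shared by k people; if the pool is positive, which
    happens with probability 1 - (1 - p)**k, everyone is retested.
    """
    if k == 1:
        return 1.0  # no pooling: exactly one test per person
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def best_pool_size(p, max_k=50):
    """Pool size in 1..max_k minimising the expected tests per person."""
    return min(range(1, max_k + 1),
               key=lambda k: expected_tests_per_person(p, k))

p = 0.01
k_star = best_pool_size(p)
cost = expected_tests_per_person(p, k_star)
print(k_star, round(cost, 4))
```

At a 1% infection probability the best pool size under these assumptions needs roughly one fifth of the tests that individual testing would. The independence assumption is what a community structure breaks: infections within a community are correlated, which is the information the paper's algorithms aim to exploit.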

Information Theory

Uncovering Complex Overlapping Pattern of Communities in Large-scale Social Networks

Elvis H. W. Xu, Pak Ming Hui (2018)

The conventional notion of community, which favors a high ratio of internal edges to outbound edges, becomes invalid when each vertex participates in multiple communities. Such behavior is commonplace in social networks. The significant overlaps among communities make most existing community detection algorithms ineffective. The lack of effective and efficient tools has resulted in very few empirical studies on large-scale detection and analysis of overlapping community structure in real social networks. We recently developed a scalable and accurate method with linear complexity, called the Partial Community Merger Algorithm (PCMA), and demonstrated its effectiveness by analyzing two online social networks, Sina Weibo and Friendster, with 79.4 and 65.6 million vertices, respectively. Here, we report in-depth analyses of the 2.9 million communities detected by PCMA to uncover their complex overlapping structure. Each community usually overlaps with a significant number of other communities and has far more outbound edges than internal edges. Yet the communities remain well separated from each other. Most vertices in a community are multi-membership vertices, and they can be at the core or the periphery. Almost half of the entire network can be accounted for by an extremely dense network of communities, with the communities being the vertices and the overlaps being the edges. The empirical findings call for rethinking the notion of community, especially the boundary of a community. Realizing that it is how the edges are organized that matters, the f-core is suggested as a suitable concept for overlapping communities in social networks. The results shed new light on the understanding of overlapping communities.

Social and Information Networks, Physics and Society