Community detection in networks without observing edges

77 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Leto Peel

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Till Hoffmann - Leto Peel - Renaud Lambiotte

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We develop a Bayesian hierarchical model to identify communities in networks for which we do not observe the edges directly, but instead observe a series of interdependent signals for each of the nodes. Fitting the model provides an end-to-end community detection algorithm that does not extract information as a sequence of point estimates but propagates uncertainties from the raw data to the community labels. Our approach naturally supports multiscale community detection as well as the selection of an optimal scale using model comparison. We study the properties of the algorithm using synthetic data and apply it to daily returns of constituents of the S&P100 index as well as climate data from US cities.

قيم البحث

301 - Shuhan Yan , Yuting Jia , Xinbing Wang 2021

Analyzing the groups in the network based on same attributes, functions or connections between nodes is a way to understand network information. The task of discovering a series of node groups is called community detection. Generally, two types of in formation can be utilized to fulfill this task, i.e., the link structures and the node attributes. The temporal text network is a special kind of network that contains both sources of information. Typical representatives include online blog networks, the World Wide Web (WWW) and academic citation networks. In this paper, we study the problem of overlapping community detection in temporal text network. By examining 32 large temporal text networks, we find a lot of edges connecting two nodes with no common community and discover that nodes in the same community share similar textual contents. This scenario cannot be quantitatively modeled by practically all existing community detection methods. Motivated by these empirical observations, we propose MAGIC (Model Affiliation Graph with Interacting Communities), a generative model which captures community interactions and considers the information from both link structures and node attributes. Our experiments on 3 types of datasets show that MAGIC achieves large improvements over 4 state-of-the-art methods in terms of 4 widely-used metrics.

الشبكات الاجتماعية والمعلومات التعلم الآلي

The ground truth about metadata and community detection in networks

87 - Leto Peel , Daniel B. Larremore , 2016

Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex systems components interact. This general task is called community detection in networks and is analogous to searchi ng for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called ground truth communities. This works well in synthetic networks with planted communities because such networks links are formed explicitly based on those known communities. However, there are no planted communities in real world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. Here, we show that metadata are not the same as ground truth, and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value so a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structure.

الشبكات الاجتماعية والمعلومات تحليل البيانات والإحصاءات والاحتمال الفيزياء والمجتمع

Synwalk -- Community Detection via Random Walk Modelling

121 - Christian Toth , Denis Helic , Bernhard C. Geiger 2021

Complex systems, abstractly represented as networks, are ubiquitous in everyday life. Analyzing and understanding these systems requires, among others, tools for community detection. As no single best community detection algorithm can exist, robustne ss across a wide variety of problem settings is desirable. In this work, we present Synwalk, a random walk-based community detection method. Synwalk builds upon a solid theoretical basis and detects communities by synthesizing the random walk induced by the given network from a class of candidate random walks. We thoroughly validate the effectiveness of our approach on synthetic and empirical networks, respectively, and compare Synwalks performance with the performance of Infomap and Walktrap. Our results indicate that Synwalk performs robustly on networks with varying mixing parameters and degree distributions. We outperform Infomap on networks with high mixing parameter, and Infomap and Walktrap on networks with many small communities and low average degree. Our work has a potential to inspire further development of community detection via synthesis of random walks and we provide concrete ideas for future research.

الشبكات الاجتماعية والمعلومات التعلم الآلي الفيزياء والمجتمع

Online Community Detection for Event Streams on Networks

118 - Guanhua Fang , Owen G. Ward , Tian Zheng 2020

A common goal in network modeling is to uncover the latent community structure present among nodes. For many real-world networks, observed connections consist of events arriving as streams, which are then aggregated to form edges, ignoring the tempor al dynamic component. A natural way to take account of this temporal dynamic component of interactions is to use point processes as the foundation of the network models for community detection. Computational complexity hampers the scalability of such approaches to large sparse networks. To circumvent this challenge, we propose a fast online variational inference algorithm for learning the community structure underlying dynamic event arrivals on a network using continuous-time point process latent network models. We provide regret bounds on the loss function of this procedure, giving theoretical guarantees on performance. The proposed algorithm is illustrated, using both simulation studies and real data, to have comparable performance in terms of community structure in terms of community recovery to non-online variants. Our proposed framework can also be readily modified to incorporate other popular network structures.

الشبكات الاجتماعية والمعلومات التعلم الآلي التعلم الالي

Efficiently inferring community structure in bipartite networks

217 - Daniel B. Larremore , Aaron Clauset , Abigail Z. Jacobs 2014

Bipartite networks are a common type of network data in which there are two types of vertices, and only vertices of different types can be connected. While bipartite networks exhibit community structure like their unipartite counterparts, existing ap proaches to bipartite community detection have drawbacks, including implicit parameter choices, loss of information through one-mode projections, and lack of interpretability. Here we solve the community detection problem for bipartite networks by formulating a bipartite stochastic block model, which explicitly includes vertex type information and may be trivially extended to $k$-partite networks. This bipartite stochastic block model yields a projection-free and statistically principled method for community detection that makes clear assumptions and parameter choices and yields interpretable results. We demonstrate this models ability to efficiently and accurately find community structure in synthetic bipartite networks with known structure and in real-world bipartite networks with unknown structure, and we characterize its performance in practical contexts.

الشبكات الاجتماعية والمعلومات تحليل البيانات والإحصاءات والاحتمال الفيزياء والمجتمع