Information-theoretic thresholds for community detection in sparse networks

Added by Jess Banks

Publication date 2016

Field: Physics

Language: English

Authors Jess Banks - Cristopher Moore - Joe Neeman

Probability Statistical Mechanics Computational Complexity

We give upper and lower bounds on the information-theoretic threshold for community detection in the stochastic block model. Specifically, consider the symmetric stochastic block model with $q$ groups, average degree $d$, and connection probabilities $c_\text{in}/n$ and $c_\text{out}/n$ for within-group and between-group edges respectively; let $\lambda = (c_\text{in}-c_\text{out})/(qd)$. We show that, when $q$ is large and $\lambda = O(1/q)$, the critical value of $d$ at which community detection becomes possible (in physical terms, the condensation threshold) is \[ d_\text{c} = \Theta\!\left( \frac{\log q}{q \lambda^2} \right) , \] with tighter results in certain regimes. Above this threshold, we show that any partition of the nodes into $q$ groups which is as ``good'' as the planted one, in terms of the number of within- and between-group edges, is correlated with it. This gives an exponential-time algorithm that performs better than chance; specifically, community detection becomes possible below the Kesten-Stigum bound for $q \ge 5$ in the disassortative case $\lambda < 0$, and for $q \ge 11$ in the assortative case $\lambda > 0$ (similar upper bounds were obtained independently by Abbe and Sandon). Conversely, below this threshold, we show that no algorithm can label the vertices better than chance, or even distinguish the block model from an Erdős-Rényi random graph with high probability. Our lower bound on $d_\text{c}$ uses Robinson and Wormald's small subgraph conditioning method, and we also give (less explicit) results for non-symmetric stochastic block models. In the symmetric case, we obtain explicit results by using bounds on certain functions of doubly stochastic matrices due to Achlioptas and Naor; indeed, our lower bound on $d_\text{c}$ is their second moment lower bound on the $q$-colorability threshold for random graphs with a certain effective degree.
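The quantities in the abstract can be computed directly from the model parameters. The sketch below is illustrative only and rests on assumptions not spelled out in the abstract: the average degree is taken to be d = (c_in + (q-1) c_out)/q, the usual expression for a symmetric block model with equal-sized groups, and the Kesten-Stigum condition is taken in its common form d λ² > 1. The function names are hypothetical, and `threshold_scale` returns only the scaling log q / (q λ²), ignoring the constant hidden inside Θ.

```python
import math


def sbm_parameters(q, c_in, c_out):
    """Return (d, lam) for a symmetric block model with q groups.

    Assumption: groups have equal size, so the average degree is
    d = (c_in + (q - 1) * c_out) / q. lam follows the abstract's
    definition lam = (c_in - c_out) / (q * d).
    """
    d = (c_in + (q - 1) * c_out) / q
    lam = (c_in - c_out) / (q * d)
    return d, lam


def above_kesten_stigum(d, lam):
    """Check the Kesten-Stigum condition in its common form d * lam^2 > 1
    (an assumption; the abstract names the bound but does not state it)."""
    return d * lam ** 2 > 1


def threshold_scale(q, lam):
    """Scaling log(q) / (q * lam^2) of the critical degree d_c.

    This is NOT the threshold itself: the Theta(...) in the abstract
    hides constant factors that this function ignores.
    """
    return math.log(q) / (q * lam ** 2)


if __name__ == "__main__":
    d, lam = sbm_parameters(q=2, c_in=9.0, c_out=1.0)
    print(d, lam, above_kesten_stigum(d, lam), threshold_scale(2, lam))
```

A negative `lam` corresponds to the disassortative case (c_in < c_out) and a positive one to the assortative case.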

Information-theoretic thresholds for community detection in sparse networks

Jess Banks, Cristopher Moore, 2016

We give upper and lower bounds on the information-theoretic threshold for community detection in the stochastic block model. Specifically, let $k$ be the number of groups, $d$ be the average degree, the probability of edges between vertices within and between groups be $c_\mathrm{in}/n$ and $c_\mathrm{out}/n$ respectively, and let $\lambda = (c_\mathrm{in}-c_\mathrm{out})/(kd)$. We show that, when $k$ is large and $\lambda = O(1/k)$, the critical value of $d$ at which community detection becomes possible (in physical terms, the condensation threshold) is \[ d_c = \Theta\!\left( \frac{\log k}{k \lambda^2} \right) , \] with tighter results in certain regimes. Above this threshold, we show that the only partitions of the nodes into $k$ groups are correlated with the ground truth, giving an exponential-time algorithm that performs better than chance; in particular, detection is possible for $k \ge 5$ in the disassortative case $\lambda < 0$ and for $k \ge 11$ in the assortative case $\lambda > 0$. (Similar upper bounds were obtained independently by Abbe and Sandon.) Below this threshold, we use recent results of Neeman and Netrapalli (who generalized arguments of Mossel, Neeman, and Sly) to show that no algorithm can label the vertices better than chance, or even distinguish the block model from an Erdős-Rényi random graph with high probability. We also rely on bounds on certain functions of doubly stochastic matrices due to Achlioptas and Naor; indeed, our lower bound on $d_c$ is the second moment lower bound on the $k$-colorability threshold for random graphs with a certain effective degree.
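To make the model concrete, here is a minimal sketch of sampling a graph from the symmetric block model described above, using only the Python standard library. It assumes each vertex gets an independent, uniformly random group label (the abstract does not say whether groups are exactly balanced), and the function name `sample_symmetric_sbm` is hypothetical. The loop over all pairs takes quadratic time, so it is meant for small illustrative graphs only.

```python
import random


def sample_symmetric_sbm(n, k, c_in, c_out, seed=0):
    """Sample (labels, edges) from a sparse symmetric block model.

    Assumption: labels are drawn independently and uniformly from
    {0, ..., k-1}. Each pair (u, v) with u < v is joined with
    probability c_in / n if the labels match and c_out / n otherwise,
    as in the abstract.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = (c_in if labels[u] == labels[v] else c_out) / n
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_symmetric_sbm(n=300, k=3, c_in=8.0, c_out=1.0)
    avg_degree = 2 * len(edges) / len(labels)
    print(f"{len(edges)} edges, empirical average degree {avg_degree:.2f}")
```

Because the connection probabilities scale as 1/n, the expected degree stays bounded as n grows, which is the sparse regime the abstract refers to.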

Probability Statistical Mechanics Computational Complexity

Community detection in the sparse hypergraph stochastic block model

Soumik Pal, Yizhe Zhu, 2019

We consider the community detection problem in sparse random hypergraphs. Angelini et al. (2015) conjectured the existence of a sharp threshold on model parameters for community detection in sparse hypergraphs generated by a hypergraph stochastic block model. We solve the positive part of the conjecture for the case of two blocks: above the threshold, there is a spectral algorithm which asymptotically almost surely constructs a partition of the hypergraph correlated with the true partition. Our method is a generalization to random hypergraphs of the method developed by Massoulié (2014) for sparse random graphs.

Probability Machine Learning Social and Information Networks

Information-theoretic and algorithmic thresholds for group testing

Amin Coja-Oghlan, Oliver Gebhard, Max Hahn-Klimroth, 2019

In the group testing problem we aim to identify a small number of infected individuals within a large population. We avail ourselves of a procedure that can test a group of multiple individuals, with the test result coming out positive if and only if at least one individual in the group is infected. With all tests conducted in parallel, what is the least number of tests required to identify the status of all individuals? In a recent test design [Aldridge et al. 2016] the individuals are assigned to test groups randomly, with every individual joining an equal number of groups. We pinpoint the sharp threshold for the number of tests required in this randomised design so that it is information-theoretically possible to infer the infection status of every individual. Moreover, we analyse two efficient inference algorithms. These results settle conjectures from [Aldridge et al. 2014, Johnson et al. 2019].
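The test design described above can be simulated in a few lines. The sketch below is illustrative only: the names `simulate_group_testing` and `definitely_healthy`, and the parameters `n`, `k`, `m`, `delta`, are hypothetical, and it assumes each individual joins `delta` distinct tests chosen uniformly at random (the abstract says only that every individual joins an equal number of groups). The decoding step uses nothing beyond the logical fact that anyone appearing in a negative test must be uninfected; it is not claimed to be one of the two algorithms the paper analyses.

```python
import random


def simulate_group_testing(n, k, m, delta, seed=0):
    """Simulate one round of non-adaptive group testing.

    n individuals, k of them infected, m tests run in parallel.
    Assumption: each individual joins exactly `delta` distinct tests
    chosen uniformly at random (requires delta <= m).
    Returns (infected, tests, results).
    """
    rng = random.Random(seed)
    infected = set(rng.sample(range(n), k))
    tests = [set() for _ in range(m)]
    for person in range(n):
        for t in rng.sample(range(m), delta):
            tests[t].add(person)
    # A test is positive iff it contains at least one infected individual.
    results = [bool(group & infected) for group in tests]
    return infected, tests, results


def definitely_healthy(tests, results):
    """Everyone who appears in at least one negative test is uninfected."""
    healthy = set()
    for group, positive in zip(tests, results):
        if not positive:
            healthy |= group
    return healthy


if __name__ == "__main__":
    infected, tests, results = simulate_group_testing(n=1000, k=10, m=150, delta=8)
    cleared = definitely_healthy(tests, results)
    print(f"{len(cleared)} of {1000 - len(infected)} healthy individuals cleared")
```

Individuals not cleared this way are either infected or healthy but present only in positive tests; how many tests are needed to resolve that ambiguity is the question the abstract addresses.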

Discrete Mathematics Information Theory

Information Theoretic Limits of Exact Recovery in Sub-hypergraph Models for Community Detection

Jiajun Liang, Chuyang Ke, Jean Honorio, 2021

In this paper, we study the information-theoretic bounds for exact recovery in sub-hypergraph models for community detection. We define a general model called the $m$-uniform sub-hypergraph stochastic block model ($m$-ShSBM). Under the $m$-ShSBM, we use Fano's inequality to identify the region of model parameters where any algorithm fails to exactly recover the planted communities with large probability. We also identify the region where a Maximum Likelihood Estimation (MLE) algorithm succeeds in exactly recovering the communities with high probability. Our bounds are tight and pertain to the community detection problems in various models such as the planted hypergraph stochastic block model, the planted densest sub-hypergraph model, and the planted multipartite hypergraph model.
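For reference, one standard form of Fano's inequality is given below in generic notation, which is an assumption here and not the paper's own: $X$ is uniform over a finite set $\mathcal{X}$ with $|\mathcal{X}| \ge 2$, $Y$ is the observation, and $\hat{X}$ is any estimator computed from $Y$. In a recovery setting, $X$ would play the role of the planted communities and $Y$ the observed hypergraph.

```latex
\[
  \Pr\bigl[\hat{X} \neq X\bigr]
  \;\ge\;
  1 - \frac{I(X;Y) + \log 2}{\log |\mathcal{X}|}
\]
```

When the mutual information $I(X;Y)$ is small relative to $\log |\mathcal{X}|$, the right-hand side is close to 1, so every estimator fails with large probability.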

Machine Learning

Information-theoretic thresholds from the cavity method

Amin Coja-Oghlan, Florent Krzakala, Will Perkins, 2016

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs and we show that the mutual information holds the key to understanding certain important phase transitions in random graph models. We work out several concrete applications of these general results. For instance, we pinpoint the exact condensation phase transition in the Potts antiferromagnet on the random graph, thereby improving prior approximate results [Contucci et al.: Communications in Mathematical Physics 2013]. Further, we prove the conjecture from [Krzakala et al.: PNAS 2007] about the condensation phase transition in the random graph coloring problem for any number $q \geq 3$ of colors. Moreover, we prove the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al.: Phys. Rev. E 2011]. Additionally, our general result implies the conjectured formula for the mutual information in Low-Density Generator Matrix codes [Montanari: IEEE Transactions on Information Theory 2005].

Discrete Mathematics Probability Data Analysis Statistics and Probability
