ترغب بنشر مسار تعليمي؟ اضغط هنا

Information-theoretic thresholds for community detection in sparse networks

56   0   0.0 ( 0 )
 نشر من قبل Jess Banks
 تاريخ النشر 2016
  مجال البحث فيزياء
والبحث باللغة English




اسأل ChatGPT حول البحث

We give upper and lower bounds on the information-theoretic threshold for community detection in the stochastic block model. Specifically, let $k$ be the number of groups, $d$ be the average degree, the probability of edges between vertices within and between groups be $c_mathrm{in}/n$ and $c_mathrm{out}/n$ respectively, and let $lambda = (c_mathrm{in}-c_mathrm{out})/(kd)$. We show that, when $k$ is large, and $lambda = O(1/k)$, the critical value of $d$ at which community detection becomes possible -- in physical terms, the condensation threshold -- is [ d_c = Theta!left( frac{log k}{k lambda^2} right) , , ] with tighter results in certain regimes. Above this threshold, we show that the only partitions of the nodes into $k$ groups are correlated with the ground truth, giving an exponential-time algorithm that performs better than chance -- in particular, detection is possible for $k ge 5$ in the disassortative case $lambda < 0$ and for $k ge 11$ in the assortative case $lambda > 0$. (Similar upper bounds were obtained independently by Abbe and Sandon.) Below this threshold, we use recent results of Neeman and Netrapalli (who generalized arguments of Mossel, Neeman, and Sly) to show that no algorithm can label the vertices better than chance, or even distinguish the block model from an ErdH{o}s-Renyi random graph with high probability. We also rely on bounds on certain functions of doubly stochastic matrices due to Achlioptas and Naor; indeed, our lower bound on $d_c$ is the second moment lower bound on the $k$-colorability threshold for random graphs with a certain effective degree.



قيم البحث

اقرأ أيضاً

We give upper and lower bounds on the information-theoretic threshold for community detection in the stochastic block model. Specifically, consider the symmetric stochastic block model with $q$ groups, average degree $d$, and connection probabilities $c_text{in}/n$ and $c_text{out}/n$ for within-group and between-group edges respectively; let $lambda = (c_text{in}-c_text{out})/(qd)$. We show that, when $q$ is large, and $lambda = O(1/q)$, the critical value of $d$ at which community detection becomes possible---in physical terms, the condensation threshold---is [ d_text{c} = Theta!left( frac{log q}{q lambda^2} right) , , ] with tighter results in certain regimes. Above this threshold, we show that any partition of the nodes into $q$ groups which is as `good as the planted one, in terms of the number of within- and between-group edges, is correlated with it. This gives an exponential-time algorithm that performs better than chance; specifically, community detection becomes possible below the Kesten-Stigum bound for $q ge 5$ in the disassortative case $lambda < 0$, and for $q ge 11$ in the assortative case $lambda >0$ (similar upper bounds were obtained independently by Abbe and Sandon). Conversely, below this threshold, we show that no algorithm can label the vertices better than chance, or even distinguish the block model from an ER random graph with high probability. Our lower bound on $d_text{c}$ uses Robinson and Wormalds small subgraph conditioning method, and we also give (less explicit) results for non-symmetric stochastic block models. In the symmetric case, we obtain explicit results by using bounds on certain functions of doubly stochastic matrices due to Achlioptas and Naor; indeed, our lower bound on $d_text{c}$ is their second moment lower bound on the $q$-colorability threshold for random graphs with a certain effective degree.
119 - Soumik Pal , Yizhe Zhu 2019
We consider the community detection problem in sparse random hypergraphs. Angelini et al. (2015) conjectured the existence of a sharp threshold on model parameters for community detection in sparse hypergraphs generated by a hypergraph stochastic blo ck model. We solve the positive part of the conjecture for the case of two blocks: above the threshold, there is a spectral algorithm which asymptotically almost surely constructs a partition of the hypergraph correlated with the true partition. Our method is a generalization to random hypergraphs of the method developed by Massouli{e} (2014) for sparse random graphs.
In the group testing problem we aim to identify a small number of infected individuals within a large population. We avail ourselves to a procedure that can test a group of multiple individuals, with the test result coming out positive iff at least o ne individual in the group is infected. With all tests conducted in parallel, what is the least number of tests required to identify the status of all individuals? In a recent test design [Aldridge et al. 2016] the individuals are assigned to test groups randomly, with every individual joining an equal number of groups. We pinpoint the sharp threshold for the number of tests required in this randomised design so that it is information-theoretically possible to infer the infection status of every individual. Moreover, we analyse two efficient inference algorithms. These results settle conjectures from [Aldridge et al. 2014, Johnson et al. 2019].
In this paper, we study the information theoretic bounds for exact recovery in sub-hypergraph models for community detection. We define a general model called the $m-$uniform sub-hypergraph stochastic block model ($m-$ShSBM). Under the $m-$ShSBM, we use Fanos inequality to identify the region of model parameters where any algorithm fails to exactly recover the planted communities with a large probability. We also identify the region where a Maximum Likelihood Estimation (MLE) algorithm succeeds to exactly recover the communities with high probability. Our bounds are tight and pertain to the community detection problems in various models such as the planted hypergraph stochastic block model, the planted densest sub-hypergraph model, and the planted multipartite hypergraph model.
Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs and we show that the mutual information holds the key to understanding certain important phase transitions in random graph models. We work out several concrete applications of these general results. For instance, we pinpoint the exact condensation phase transition in the Potts antiferromagnet on the random graph, thereby improving prior approximate results [Contucci et al.: Communications in Mathematical Physics 2013]. Further, we prove the conjecture from [Krzakala et al.: PNAS 2007] about the condensation phase transition in the random graph coloring problem for any number $qgeq3$ of colors. Moreover, we prove the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al.: Phys. Rev. E 2011]. Additionally, our general result implies the conjectured formula for the mutual information in Low-Density Generator Matrix codes [Montanari: IEEE Transactions on Information Theory 2005].
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا