
Asymptotic Mutual Information for the Two-Groups Stochastic Block Model

Published by Andrea Montanari
Publication date: 2015
Language: English





We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph $G$ from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit "single-letter" characterization of the per-vertex mutual information between the vertex labels and the graph. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime there exists a procedure that estimates the vertex labels better than random guessing.
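The generative process described in the abstract can be illustrated with a minimal sketch of a symmetric two-group sampler. This is not code from the paper: the function name `sample_two_group_sbm`, the label alphabet {-1, +1}, and the parameters `p_in` and `p_out` (within-group and across-group edge probabilities) are assumptions made here for illustration only.

```python
import numpy as np

def sample_two_group_sbm(n, p_in, p_out, seed=None):
    """Sample labels and an adjacency matrix from a symmetric two-group SBM.

    Hypothetical helper: p_in / p_out are illustrative names for the
    within-group and across-group edge probabilities.
    """
    rng = np.random.default_rng(seed)
    # Step 1: assign each vertex a label uniformly at random from {-1, +1}.
    labels = rng.choice([-1, 1], size=n)
    # Step 2: the edge probability depends only on whether the labels agree.
    same_group = np.equal.outer(labels, labels)
    probs = np.where(same_group, p_in, p_out)
    # Draw each pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adjacency = (upper | upper.T).astype(np.int8)
    return labels, adjacency

labels, adjacency = sample_two_group_sbm(200, p_in=0.10, p_out=0.02, seed=0)
```

The sketch returns an undirected simple graph (symmetric adjacency matrix, no self-loops). It only generates data; it does not compute the mutual information or locate the critical point discussed in the abstract.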




Read also

We characterize the growth of the Sibson mutual information, of any order that is at least unity, between a random variable and an increasing set of noisy, conditionally independent observations of the random variable. The Sibson mutual information increases to an order-dependent limit exponentially fast, with an exponent that is order-independent. The result is contrasted with composition theorems in differential privacy.
Jian Ma, Zengqi Sun (2008)
We prove that mutual information is actually negative copula entropy, based on which a method for mutual information estimation is proposed.
We consider the estimation of an n-dimensional vector $x$ from the knowledge of noisy and possibly non-linear element-wise measurements of $xx^T$, a very generic problem that contains, e.g., the stochastic 2-block model, submatrix localization, or the spike perturbation of random matrices. We use an interpolation method proposed by Guerra and later refined by Korada and Macris. We prove that the Bethe mutual information (related to the Bethe free energy and conjectured to be exact by Lesieur et al. on the basis of the non-rigorous cavity method) always yields an upper bound to the exact mutual information. We also provide a lower bound using a similar technique. For concreteness, we illustrate our findings on the sparse PCA problem, and observe that (a) our bounds match for a large region of parameters and (b) there exists a phase transition in a region where the spectrum remains uninformative. While we present only the case of rank-one symmetric matrix estimation, our proof technique is readily extendable to low-rank symmetric matrix or low-rank symmetric tensor estimation.
Motivated by the prevalent data science applications of processing and mining large-scale graph data such as social networks, web graphs, and biological networks, as well as the high I/O and communication costs of storing and transmitting such data, this paper investigates lossless compression of data appearing in the form of a labeled graph. A universal graph compression scheme is proposed, which does not depend on the underlying statistics/distribution of the graph model. For graphs generated by a stochastic block model, which is a widely used random graph model capturing the clustering effects in social networks, the proposed scheme achieves the optimal theoretical limit of lossless compression without the need to know edge probabilities, community labels, or the number of communities. The key ideas in establishing universality for stochastic block models include: 1) block decomposition of the adjacency matrix of the graph; 2) generalization of the Krichevsky-Trofimov probability assignment, which was initially designed for i.i.d. random processes. In four benchmark graph datasets (protein-to-protein interaction, LiveJournal friendship, Flickr, and YouTube), the compressed files from competing algorithms (including CSR, Ligra+, PNG image compressor, and Lempel-Ziv compressor for two-dimensional data) take 2.4 to 27 times the space needed by the proposed scheme.
Fredrik Rusek, Angel Lozano (2010)
We present a method to compute, quickly and efficiently, the mutual information achieved by an IID (independent identically distributed) complex Gaussian input on a block Rayleigh-faded channel without side information at the receiver. The method accommodates both scalar and MIMO (multiple-input multiple-output) settings. Operationally, the mutual information thus computed represents the highest spectral efficiency that can be attained using standard Gaussian codebooks. Examples are provided that illustrate the loss in spectral efficiency caused by fast fading and how that loss is amplified by the use of multiple transmit antennas. These examples are further enriched by comparisons with the channel capacity under perfect channel-state information at the receiver, and with the spectral efficiency attained by pilot-based transmission.