ﻻ يوجد ملخص باللغة العربية
In community detection on graphs, the semi-supervised learning problem entails inferring the ground-truth membership of each node in a graph, given the connectivity structure and a limited number of revealed node labels. Different subsets of revealed labels can in principle lead to higher or lower information gains and induce different reconstruction accuracies. In the framework of the dense stochastic block model, we employ statistical physics methods to derive a large deviation analysis for this problem, in the high-dimensional limit. This analysis allows the characterization of the fluctuations around the typical behaviour, capturing the effect of correlated label choices and yielding an estimate of their informativeness and their rareness among subsets of the same size. We find theoretical evidence of a non-monotonic relationship between reconstruction accuracy and the free energy associated to the posterior measure of the inference problem. We further discuss possible implications for active learning applications in community detection.
Active learning is a branch of machine learning that deals with problems where unlabeled data is abundant yet obtaining labels is expensive. The learning algorithm has the possibility of querying a limited number of samples to obtain the correspondin
Random constraint satisfaction problems undergo several phase transitions as the ratio between the number of constraints and the number of variables is varied. When this ratio exceeds the satisfiability threshold no more solutions exist; the satisfia
The ground-state energy E_0 of a spin glass is an example of an extreme statistic. We consider the large deviations of this energy for a variety of models when the number of spins N goes to infinity. In most cases, the behavior can be understood qual
Supervised models of NLP rely on large collections of text which closely resemble the intended testing setting. Unfortunately matching text is often not available in sufficient quantity, and moreover, within any domain of text, data is often highly h
Mesoscopic pattern extraction (MPE) is the problem of finding a partition of the nodes of a complex network that maximizes some objective function. Many well-known network inference problems fall in this category, including, for instance, community d