
Large deviations for the perceptron model and consequences for active learning

Published by: Hugo Cui
Publication date: 2019
Research language: English





Active learning is a branch of machine learning that deals with problems where unlabeled data is abundant yet obtaining labels is expensive. The learning algorithm can query a limited number of samples to obtain the corresponding labels, which are subsequently used for supervised learning. In this work, we consider the task of choosing the subset of samples to be labeled from a fixed, finite pool of samples. We assume the pool of samples to be a random matrix and the ground-truth labels to be generated by a random single-layer teacher neural network. We employ replica methods to analyze the large deviations of the accuracy achieved after supervised learning on a subset of the original pool. These large deviations then provide optimal achievable performance boundaries for any active learning algorithm. We show that the optimal learning performance can be efficiently approached by simple message-passing active learning algorithms. We also provide a comparison with the performance of some other popular active learning strategies.
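As a rough, self-contained illustration of the setting described in this abstract (a minimal sketch, not the authors' code: the pool size, input dimension, label budget, and the random-subset baseline are illustrative assumptions), the snippet below labels a random Gaussian pool with a random single-layer teacher, trains a student perceptron on a labeled subset, and measures its generalization accuracy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, budget = 1000, 50, 100                   # pool size, input dimension, label budget (illustrative)

X = rng.standard_normal((n, d)) / np.sqrt(d)   # fixed finite pool of samples
w_teacher = rng.standard_normal(d)             # random single-layer teacher network
y = np.sign(X @ w_teacher)                     # ground-truth labels (revealed only when queried)

def train_student(X_sub, y_sub, epochs=200):
    """Plain perceptron learning on the labeled subset."""
    w = np.zeros(X_sub.shape[1])
    for _ in range(epochs):
        for x, t in zip(X_sub, y_sub):
            if np.sign(x @ w) != t:
                w += t * x
    return w

# Baseline: query a random subset of the pool for labels.
idx = rng.choice(n, size=budget, replace=False)
w_student = train_student(X[idx], y[idx])

# Generalization accuracy against the teacher on fresh data.
X_test = rng.standard_normal((5000, d)) / np.sqrt(d)
acc = np.mean(np.sign(X_test @ w_student) == np.sign(X_test @ w_teacher))
print(f"test accuracy with a random subset: {acc:.3f}")
```

The large-deviation analysis characterizes how much better or worse than such a random-subset baseline a particular choice of the labeled subset can perform, for any selection strategy.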




Read also

In community detection on graphs, the semi-supervised learning problem entails inferring the ground-truth membership of each node in a graph, given the connectivity structure and a limited number of revealed node labels. Different subsets of revealed labels can in principle lead to higher or lower information gains and induce different reconstruction accuracies. In the framework of the dense stochastic block model, we employ statistical physics methods to derive a large deviation analysis for this problem, in the high-dimensional limit. This analysis allows us to characterize the fluctuations around the typical behaviour, capturing the effect of correlated label choices and yielding an estimate of their informativeness and their rareness among subsets of the same size. We find theoretical evidence of a non-monotonic relationship between reconstruction accuracy and the free energy associated with the posterior measure of the inference problem. We further discuss possible implications for active learning applications in community detection.
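A minimal sketch of the inference setting (the number of nodes, connection probabilities, and number of revealed labels are illustrative assumptions, not values taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200                                  # number of nodes (illustrative)
labels = rng.integers(0, 2, size=n)      # ground-truth community of each node

# Dense stochastic block model: higher connection probability within groups.
p_in, p_out = 0.6, 0.4
same = labels[:, None] == labels[None, :]
P = np.where(same, p_in, p_out)
A = (rng.random((n, n)) < P).astype(int)
A = np.triu(A, 1); A = A + A.T           # symmetric adjacency, no self-loops

# Semi-supervised setting: only a chosen subset of node labels is revealed.
revealed = rng.choice(n, size=20, replace=False)
observed = {i: labels[i] for i in revealed}
```

Different choices of the revealed set carry different amounts of information about the hidden labels; the large-deviation analysis quantifies how rare unusually informative or uninformative subsets of a given size are.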
Carlo Baldassi, 2012
We consider the generalization problem for a perceptron with binary synapses, implementing the Stochastic Belief-Propagation-Inspired (SBPI) learning algorithm which we proposed earlier, and perform a mean-field calculation to obtain a differential equation which describes the behaviour of the device in the limit of a large number of synapses N. We show that the solving time of SBPI is of order N*sqrt(log(N)), while the similar, well-known clipped perceptron (CP) algorithm does not converge to a solution at all in the time frame we considered. The analysis gives some insight into the ongoing process and shows that, in this context, the SBPI algorithm is equivalent to a new, simpler algorithm, which only differs from the CP algorithm by the addition of a stochastic, unsupervised meta-plastic reinforcement process, whose rate of application must be less than sqrt(2/(pi * N)) for the learning to be achieved effectively. The analytical results are confirmed by simulations.
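A schematic sketch of the rule described above, assuming hidden integer synaptic states whose sign gives the binary weight; the state depth, the task sizes, and the exact form of the reinforcement move are illustrative assumptions rather than the published SBPI specification:

```python
import numpy as np

rng = np.random.default_rng(2)
N, P, K = 1001, 300, 7                 # synapses, patterns, hidden-state depth (illustrative)
p_reinforce = 0.3 * np.sqrt(2.0 / (np.pi * N))   # kept below the sqrt(2/(pi*N)) bound quoted above

w_teacher = rng.choice([-1, 1], size=N)
xi = rng.choice([-1, 1], size=(P, N))
sigma = np.sign(xi @ w_teacher)        # labels produced by a binary-synapse teacher

h = rng.choice([-1, 1], size=N)        # hidden integer states; the binary weights are their sign
for epoch in range(100):
    errors = 0
    for mu in rng.permutation(P):
        w = np.sign(h)
        if np.sign(xi[mu] @ w) != sigma[mu]:
            # clipped-perceptron move: push hidden states toward the desired output
            h = np.clip(h + 2 * sigma[mu] * xi[mu], -K, K)
            errors += 1
        if rng.random() < p_reinforce:
            # stochastic, unsupervised meta-plastic reinforcement of the current states
            h = np.clip(h + 2 * np.sign(h), -K, K)
    if errors == 0:
        break
```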
The ground-state energy E_0 of a spin glass is an example of an extreme statistic. We consider the large deviations of this energy for a variety of models when the number of spins N goes to infinity. In most cases, the behavior can be understood qualitatively, in particular with the help of semi-analytical results for hierarchical lattices. Particular attention is paid to the Sherrington-Kirkpatrick model; after comparing to the Tracy-Widom distribution which follows from the spherical approximation, we find that the large deviations give rise to non-trivial scaling laws with N.
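To make the quantity concrete, here is a brute-force sketch of the ground-state energy E_0 for small Sherrington-Kirkpatrick instances (exhaustive enumeration at N = 12, purely illustrative; the paper's large-deviation results concern the N -> infinity scaling, which enumeration cannot reach):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
N, samples = 12, 200                                   # system size and disorder samples (illustrative)
configs = np.array(list(product([-1, 1], repeat=N)))   # all 2^N spin configurations

e0 = []
for _ in range(samples):
    J = rng.standard_normal((N, N)) / np.sqrt(N)   # SK couplings with variance 1/N
    J = np.triu(J, 1); J = J + J.T                 # symmetric, zero diagonal
    # H(s) = -sum_{i<j} J_ij s_i s_j = -(1/2) s.J.s ; take the exhaustive minimum
    E = -0.5 * np.einsum('ci,ij,cj->c', configs, J, configs)
    e0.append(E.min() / N)                         # intensive ground-state energy E_0 / N

print(f"mean E_0/N = {np.mean(e0):.3f}, std = {np.std(e0):.3f}")
```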
Neural networks are able to extract information from the timing of spikes. Here we provide new results on the behavior of the simplest neuronal model which is able to decode information embedded in temporal spike patterns, the so-called tempotron. Using statistical physics techniques we compute the capacity for the case of sparse, time-discretized input, and material discrete synapses, showing that the device saturates the information theoretic bounds with a statistics of output spikes that is consistent with the statistics of the inputs. We also derive two simple and highly efficient learning algorithms which are able to learn a number of associations which are close to the theoretical limit.
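A minimal sketch of the tempotron decision rule on time-discretized input, assuming binary synapses and an exponential postsynaptic kernel (the kernel, threshold, and input statistics are illustrative choices, not those analyzed in the paper):

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 100, 20                          # afferents and time bins (illustrative)
tau, theta = 5.0, 10.0                  # kernel decay constant and firing threshold (illustrative)

w = rng.choice([0, 1], size=N)                      # discrete (here binary) synaptic weights
spikes = (rng.random((N, T)) < 0.1).astype(float)   # sparse, time-discretized input spike trains
kernel = np.exp(-np.arange(T) / tau)                # decaying postsynaptic kernel on the time grid

def tempotron_fires(w, spikes):
    """Output spike iff the summed postsynaptic potential crosses the threshold in some bin."""
    drive = w @ spikes                     # total weighted synaptic input arriving in each bin
    V = np.convolve(drive, kernel)[:T]     # causal filtering: V[t] = sum_{s<=t} drive[s] * kernel[t-s]
    return np.max(V) >= theta

print("output spike:", tempotron_fires(w, spikes))
```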
We show how to calculate the likelihood of dynamical large deviations using evolutionary reinforcement learning. An agent, a stochastic model, propagates a continuous-time Monte Carlo trajectory and receives a reward conditioned upon the values of certain path-extensive quantities. Evolution produces progressively fitter agents, eventually allowing the calculation of a piece of a large-deviation rate function for a particular model and path-extensive quantity. For models with small state spaces the evolutionary process acts directly on rates, and for models with large state spaces the process acts on the weights of a neural network that parameterizes the model's rates. This approach shows how path-extensive physics problems can be considered within a framework widely used in machine learning.
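A toy version of the evolutionary loop for a two-state continuous-time model; the reward (closeness of the trajectory's jump count per unit time to a target value), the population size, and the mutation scale are illustrative assumptions rather than the authors' scheme:

```python
import numpy as np

rng = np.random.default_rng(5)
T_obs, target_rate = 50.0, 2.0       # observation time and target jump rate per unit time (illustrative)

def trajectory_jumps(k01, k10, T):
    """Continuous-time Monte Carlo for a two-state model; returns the number of jumps up to time T."""
    state, t, jumps = 0, 0.0, 0
    while True:
        rate = k01 if state == 0 else k10
        t += rng.exponential(1.0 / rate)
        if t > T:
            return jumps
        state, jumps = 1 - state, jumps + 1

# Evolutionary loop: agents are (k01, k10) rate pairs, rewarded when their trajectories'
# jump count per unit time (a path-extensive quantity) is close to the target.
population = [np.exp(rng.normal(0, 1, size=2)) for _ in range(20)]
for generation in range(30):
    rewards = []
    for rates in population:
        jumps = trajectory_jumps(rates[0], rates[1], T_obs)
        rewards.append(-abs(jumps / T_obs - target_rate))
    # keep the fittest half, refill the population with mutated copies
    order = np.argsort(rewards)[::-1]
    survivors = [population[i] for i in order[:10]]
    population = survivors + [s * np.exp(rng.normal(0, 0.1, size=2)) for s in survivors]

print("evolved rates:", population[0])
```

In the paper, the evolved agents are then used to estimate a piece of the large-deviation rate function for the chosen path-extensive observable.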
