Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

Added by Stefano Sarao Mannelli

Publication date 2018

fields Informatics Engineering Physics

Language English

Authors Stefano Sarao Mannelli - Giulio Biroli - Chiara Cammarota

Abstract

Gradient-descent-based algorithms and their stochast

Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures

Carlo Baldassi, Enrico M. Malatesta, Matteo Negri, 2020

We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function. We show that there exist configurations that achieve the Bayes-optimal generalization error, even in the case of unbalanced clusters. We explore analytically the error-counting loss landscape in the vicinity of a Bayes-optimal solution, and show that the closer we get to such configurations, the higher the local entropy, implying that the Bayes-optimal solution lies inside a wide flat region. We also consider the algorithmically relevant case of targeting wide flat minima of the (differentiable) mean squared error loss. Our analytical and numerical results show not only that in the balanced case the dependence on the norm of the weights is mild, but also that in the unbalanced case the performance can be improved.

Machine Learning; Disordered Systems and Neural Networks; Statistics Theory

Emergent limits of an indirect measurement from phase transitions of inference

Satoru Tokuda, Kenji Nagata, Masato Okada, 2020

Measurements are inseparable from inference, where the estimation of signals of interest from other observations is called an indirect measurement. While a variety of measurement limits have been defined by the physical constraints on each setup, the fundamental limit of an indirect measurement is essentially the limit of inference. Here, we propose the concept of statistical limits on indirect measurement: the bounds of distinction between signals and noise and between one signal and another. By developing the asymptotic theory of Bayesian regression, we investigate the phenomenology of a typical indirect measurement and demonstrate the existence of these limits. Based on the connection between inference and statistical physics, we also provide a unified interpretation in which these limits emerge from phase transitions of inference. Our results could pave the way for novel experimental design, enabling one to assess the required quality of observations according to the assumed ground truth before the indirect measurement concerned is actually performed.

Data Analysis, Statistics and Probability; Disordered Systems and Neural Networks; Statistics Theory

Thresholds of descending algorithms in inference problems

Stefano Sarao Mannelli, Lenka Zdeborova, 2020

We review recent works on analyzing the dynamics of gradient-based algorithms in a prototypical statistical inference problem. Using methods and insights from the physics of glassy systems, these works showed how to understand quantitatively and qualitatively the performance of gradient-based algorithms. Here we review the key results and their interpretation in non-technical terms accessible to a wide audience of physicists in the context of related works.

Machine Learning; Disordered Systems and Neural Networks; Machine Learning

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, 2019

In this work we analyse quantitatively the interplay between the loss landscape and the performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model. We study a loss function that is the negative log-likelihood of the model. We analyse the number of local minima at a fixed distance from the signal/spike with the Kac-Rice formula, and locate the trivialization of the landscape at large signal-to-noise ratios. We evaluate in closed form the performance of a gradient flow algorithm using integro-differential PDEs as developed in the physics of disordered systems for the Langevin dynamics. We analyse the performance of an approximate message passing algorithm estimating the maximum likelihood configuration via its state evolution. We conclude by comparing the above results: while we observe a drastic slowdown of the gradient flow dynamics even in the region where the landscape is trivial, both of the analysed algorithms are shown to perform well even in the part of the parameter region where spurious local minima are present.

Machine Learning; Disordered Systems and Neural Networks; Statistics Theory

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, 2019

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient flow originating from statistical physics. We show that there is a well-defined region of parameters where the gradient-flow algorithm finds a good global minimum despite the presence of exponentially many spurious local minima. We show that this is achieved by surfing on saddles that have a strong negative direction towards the global minima, a phenomenon that is connected to a BBP-type threshold in the Hessian describing the critical points of the landscape.

Machine Learning; Disordered Systems and Neural Networks; Statistics Theory
