
Entropic alternatives to initialization

Published by: Daniele Musso
Publication date: 2021
Research field: Physics
Paper language: English
Author: Daniele Musso





Local entropic loss functions provide a versatile framework for defining architecture-aware regularization procedures. Besides the possibility of being anisotropic in synaptic space, the local entropic smoothing of the loss function can vary during training, thus yielding a tunable model complexity. A scoping protocol in which the regularization is strong in the early stage of training and then fades progressively away constitutes an alternative to standard initialization procedures for deep convolutional neural networks; it nonetheless has wider applicability. We analyze anisotropic, local entropic smoothings in the language of statistical physics and information theory, providing insight into both their interpretation and their workings. We comment on some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
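The scoping idea can be made concrete with a Monte Carlo estimate of a locally smoothed loss. The sketch below is our own illustration, not the paper's code: it assumes an Entropy-SGD-style estimator, and the names `local_entropic_loss`, `scope_schedule`, the per-coordinate `sigma0` (the anisotropy), and the double-well toy loss are all placeholders.

```python
import math
import torch

def local_entropic_loss(loss_fn, w, sigma, n_samples=16):
    """Monte Carlo estimate of a locally smoothed (entropic) loss,
    F(w) = -log E_{eps ~ N(0, diag(sigma^2))} [exp(-loss_fn(w + eps))].
    A per-coordinate sigma makes the smoothing anisotropic in synaptic space."""
    vals = torch.stack([-loss_fn(w + torch.randn_like(w) * sigma)
                        for _ in range(n_samples)])
    # log-mean-exp for numerical stability
    return -(torch.logsumexp(vals, dim=0) - math.log(n_samples))

def scope_schedule(step, total_steps, sigma0):
    """Scoping protocol: strong smoothing early in training, fading to zero,
    so the entropic term acts as an early bias that is later removed."""
    return sigma0 * (1.0 - step / total_steps)

# Toy usage: descend a double-well loss under a fading anisotropic smoothing.
w = torch.zeros(10, requires_grad=True)
sigma0 = torch.linspace(0.5, 0.05, 10)       # direction-dependent widths
opt = torch.optim.SGD([w], lr=0.1)
for step in range(200):
    opt.zero_grad()
    sigma = scope_schedule(step, 200, sigma0)
    loss = local_entropic_loss(lambda v: 0.25 * ((v ** 2 - 1) ** 2).sum(), w, sigma)
    loss.backward()
    opt.step()
```

As `sigma` goes to zero the estimator reduces to the bare loss, so the smoothing acts only as an early-training bias; this is the sense in which scoping can substitute for a careful initialization.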


Read also

We study both classical and quantum algorithms to solve a hard optimization problem, namely 3-XORSAT on 3-regular random graphs. By introducing a new quasi-greedy algorithm that is not allowed to jump over large energy barriers, we show that the problem hardness is mainly due to entropic barriers. We study, both analytically and numerically, several optimization algorithms, finding that entropic barriers affect classical local algorithms and quantum annealing in a similar way. For the adiabatic algorithm, the difficulty we identify is distinct from that of tunnelling under large barriers, but does, nonetheless, give rise to exponential running (annealing) times.
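As a concrete illustration of the quasi-greedy idea, here is a minimal sketch (ours, not the paper's code) on a planted 3-XORSAT instance: single-variable flips are accepted only when the number of violated parity checks does not increase, so the walk can drift along flat entropic regions but never climbs a barrier.

```python
import random

def random_3regular_xorsat(n, seed=0):
    """Planted 3-XORSAT: every variable appears in exactly three clauses
    (random stub matching, resampled until clauses have distinct variables),
    with parities set by a hidden assignment so a zero-energy solution exists."""
    rng = random.Random(seed)
    stubs = [i for i in range(n) for _ in range(3)]
    while True:
        rng.shuffle(stubs)
        clauses = [tuple(stubs[3 * c:3 * c + 3]) for c in range(n)]
        if all(len(set(cl)) == 3 for cl in clauses):
            break
    planted = [rng.randrange(2) for _ in range(n)]
    parities = [planted[i] ^ planted[j] ^ planted[k] for i, j, k in clauses]
    return clauses, parities

def quasi_greedy(clauses, parities, n, steps=100000, seed=1):
    """Quasi-greedy dynamics: flip a random variable only if the number of
    violated checks does not increase -- the walk explores flat (entropic)
    regions but is never allowed to jump over an energy barrier."""
    rng = random.Random(seed)
    x = [rng.randrange(2) for _ in range(n)]
    touching = [[] for _ in range(n)]            # clauses containing each var
    for c, cl in enumerate(clauses):
        for i in cl:
            touching[i].append(c)
    unsat = [(x[i] ^ x[j] ^ x[k]) != b for (i, j, k), b in zip(clauses, parities)]
    e = sum(unsat)
    for _ in range(steps):
        if e == 0:
            break
        v = rng.randrange(n)
        # flipping v toggles every clause it touches
        delta = sum(-1 if unsat[c] else 1 for c in touching[v])
        if delta <= 0:
            x[v] ^= 1
            e += delta
            for c in touching[v]:
                unsat[c] = not unsat[c]
    return x, e

clauses, parities = random_3regular_xorsat(300)
x, e = quasi_greedy(clauses, parities, 300)
print("violated checks after quasi-greedy walk:", e)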
We study the distribution of the minimum free energy (MFE) for the Turner model of pseudoknot free RNA secondary structures over ensembles of random RNA sequences. In particular, we are interested in those rare and intermediate events of unexpected low MFEs. Generalized ensemble Markov-chain Monte Carlo methods allow us to explore the rare-event tail of the MFE distribution down to probabilities like $10^{-70}$ and to study the relationship between the sequence entropy and structural properties for sequence ensembles with fixed MFEs. Entropic and structural properties of those ensembles are compared with natural RNA of the same reduced MFE (z-score).
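The generalized-ensemble idea can be sketched with an exponentially tilted Metropolis walk in sequence space. Everything below is a toy stand-in: `toy_energy` is a crude proxy for the Turner-model MFE (a real study would call an RNA folding package such as ViennaRNA), and the single-temperature tilt simplifies the multi-ensemble methods the abstract refers to.

```python
import math, random

BASES = "ACGU"

def toy_energy(seq):
    """Hypothetical stand-in for the Turner-model MFE: minus the number of
    complementary pairs between the two halves of the sequence. Purely
    illustrative -- it only rewards crudely 'foldable' sequences."""
    pair = {"A": "U", "U": "A", "G": "C", "C": "G"}
    half = len(seq) // 2
    return -sum(pair[a] == b for a, b in zip(seq[:half], reversed(seq[half:])))

def metropolis_ensemble(L=40, T=0.5, steps=20000, seed=0):
    """Metropolis sampling of sequences from p(s) ~ exp(-E(s)/T).
    Lowering T concentrates the walk on the rare low-energy tail; reweighting
    each sample by exp(+E(s)/T) recovers unbiased tail probabilities."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(L)]
    e = toy_energy(seq)
    energies = []
    for _ in range(steps):
        i = rng.randrange(L)
        old = seq[i]
        seq[i] = rng.choice(BASES)
        e_new = toy_energy(seq)
        if e_new <= e or rng.random() < math.exp((e - e_new) / T):
            e = e_new            # accept the mutation
        else:
            seq[i] = old         # reject and restore
        energies.append(e)
    return energies

energies = metropolis_ensemble()
print("lowest toy energy reached:", min(energies))
```

The reweighting step is what gives access to probabilities as small as the $10^{-70}$ quoted above: the tilted ensemble visits those sequences routinely, and the exponential weights convert visit frequencies back into uniform-ensemble probabilities.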
We study link-diluted $\pm J$ Ising spin glass models on the hierarchical lattice and on a three-dimensional lattice close to the percolation threshold. We show that previously computed zero temperature fixed points are unstable with respect to temperature perturbations and do not belong to any critical line in the dilution-temperature plane. We discuss implications of the presence of such spurious unstable fixed points on the use of optimization algorithms, and we show how entropic effects should be taken into account to obtain the right physical behavior and critical points.
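A minimal way to see the entropic effects mentioned above is to enumerate the ground states of a small diluted $\pm J$ model and count their degeneracy. The ring geometry below is our own toy stand-in for the hierarchical and three-dimensional lattices in the abstract.

```python
import itertools, random

def diluted_pm_J_ring(n, p_keep=0.6, seed=0):
    """Random +/-J couplings on a ring, each link kept with probability
    p_keep (link dilution). A toy stand-in for the lattices discussed above."""
    rng = random.Random(seed)
    return {(i, (i + 1) % n): rng.choice([-1, 1])
            for i in range(n) if rng.random() < p_keep}

def ground_states(n, bonds):
    """Exact enumeration of all 2^n spin configurations: returns the ground
    state energy and its degeneracy. The degeneracy is the zero-temperature
    entropy that a single-answer optimization algorithm never sees."""
    best, count = None, 0
    for s in itertools.product((-1, 1), repeat=n):
        e = -sum(J * s[i] * s[j] for (i, j), J in bonds.items())
        if best is None or e < best:
            best, count = e, 1
        elif e == best:
            count += 1
    return best, count

bonds = diluted_pm_J_ring(16)
e0, g = ground_states(16, bonds)
print(f"E0 = {e0}, ground-state degeneracy = {g}")
```

Spins disconnected by dilution multiply the degeneracy, so `g` grows quickly near the percolation threshold; weighting configurations by this entropy, rather than by energy alone, is the kind of correction the abstract argues is needed.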
Hajime Yoshino (2019)
We develop a statistical mechanical approach based on the replica method to study the design space of deep and wide neural networks constrained to meet a large number of training data. Specifically, we analyze the configuration space of the synaptic weights and neurons in the hidden layers in a simple feed-forward perceptron network for two scenarios: a setting with random inputs/outputs and a teacher-student setting. By increasing the strength of the constraints, i.e., increasing the number of training data, successive second-order glass transitions (random inputs/outputs) or second-order crystalline transitions (teacher-student setting) take place layer by layer, starting next to the input/output boundaries and moving deeper into the bulk, with the thickness of the solid phase growing logarithmically with the data size. This implies that the typical storage capacity of the network grows exponentially fast with the depth. In a deep enough network, the central part remains in the liquid phase. We argue that in systems of finite width $N$, a weak bias field can remain in the center and play the role of a symmetry-breaking field that connects the opposite sides of the system. The successive glass transitions bring about a hierarchical free-energy landscape with ultrametricity, which evolves in space: it is most complex close to the boundaries but becomes renormalized into progressively simpler ones in deeper layers. These observations provide clues to understanding why deep neural networks operate efficiently. Finally, we present numerical simulations of learning which reveal spatially heterogeneous glassy dynamics truncated by a finite-width $N$ effect.
Yichen Huang (2021)
It is well known that in Anderson localized systems, starting from a random product state, the entanglement entropy remains bounded at all times. However, we show that adding a single boundary term to an otherwise Anderson localized Hamiltonian leads to unbounded growth of entanglement. Our results imply that Anderson localization is not a local property. One cannot conclude that a subsystem exhibits Anderson localized behavior without looking at the whole system, since a term arbitrarily far from the subsystem can affect its dynamics in such a way that the features of Anderson localization are lost.
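The claim can be probed numerically with exact diagonalization of a small chain. The sketch below is our own illustration, not necessarily the paper's setup: it takes a random-field XX chain (Anderson localized after a Jordan-Wigner mapping), adds a single SzSz interaction at the edge as the boundary term, and tracks the half-chain entanglement entropy of an evolving product state. At these tiny sizes any growth is only suggestive.

```python
import numpy as np
from numpy.linalg import eigh, svd

L = 8                                        # 2^L-dimensional Hilbert space
rng = np.random.default_rng(0)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
id2 = np.eye(2, dtype=complex)

def op_at(op, site):
    """Embed a single-site operator at `site` in the L-site chain."""
    mats = [id2] * L
    mats[site] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def two_site(a, b, i, j):
    return op_at(a, i) @ op_at(b, j)

# Random-field XX chain: Anderson localized after a Jordan-Wigner mapping.
H = sum(two_site(sx, sx, i, i + 1) + two_site(sy, sy, i, i + 1)
        for i in range(L - 1))
H = H + sum(8.0 * rng.uniform(-1, 1) * op_at(sz, i) for i in range(L))

# A single interaction term at the edge, standing in for the boundary term.
H_boundary = H + two_site(sz, sz, 0, 1)

def half_chain_entropy(psi):
    """Von Neumann entropy of the left half, via the Schmidt decomposition."""
    s = svd(psi.reshape(2 ** (L // 2), -1), compute_uv=False)
    p = s[s > 1e-12] ** 2
    return float(-(p * np.log(p)).sum())

def entropy_vs_time(H, times, idx):
    vals, vecs = eigh(H)
    psi0 = np.zeros(2 ** L, dtype=complex)
    psi0[idx] = 1.0                          # a z-basis product state
    c = vecs.conj().T @ psi0
    return [half_chain_entropy(vecs @ (np.exp(-1j * vals * t) * c))
            for t in times]

idx = int("01" * (L // 2), 2)                # Neel-like initial product state
times = np.linspace(0.0, 50.0, 6)
print("localized:    ", np.round(entropy_vs_time(H, times, idx), 3))
print("with boundary:", np.round(entropy_vs_time(H_boundary, times, idx), 3))
```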
