
Training Restricted Boltzmann Machines via the Thouless-Anderson-Palmer Free Energy

Published by Eric Tramel
Publication date: 2015
Language: English





Restricted Boltzmann machines are undirected neural networks which have been shown to be effective in many applications, including serving as initializations for training deep multi-layer neural networks. One of the main reasons for their success is the existence of efficient and practical stochastic algorithms, such as contrastive divergence, for unsupervised training. We propose an alternative deterministic iterative procedure based on an improved mean-field method from statistical physics known as the Thouless-Anderson-Palmer approach. We demonstrate that our algorithm provides performance equal to, and sometimes superior to, persistent contrastive divergence, while also providing a clear and easy-to-evaluate objective function. We believe that this strategy can be easily generalized to other models as well as to more accurate higher-order approximations, paving the way for systematic improvements in training Boltzmann machines with hidden units.
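
The deterministic procedure described in the abstract can be illustrated with a minimal sketch. It assumes binary {0,1} units and a second-order truncation of the TAP expansion, in which the naive mean-field free energy gains a correction proportional to the squared weights. The function names (`tap_magnetizations`, `tap_free_energy`), the damping factor, and the iteration count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_entropy(m, eps=1e-12):
    # Entropy of independent Bernoulli units with means m.
    m = np.clip(m, eps, 1.0 - eps)
    return -np.sum(m * np.log(m) + (1.0 - m) * np.log(1.0 - m))

def tap_magnetizations(W, a, b, n_iter=50, damping=0.5):
    """Damped fixed-point iteration for the second-order TAP equations.

    W has shape (n_visible, n_hidden); a and b are the visible and
    hidden biases. Returns the visible and hidden magnetizations.
    """
    mv, mh = sigmoid(a), sigmoid(b)
    W2 = W ** 2
    for _ in range(n_iter):
        var_v = mv - mv ** 2
        # Mean-field input minus the second-order (Onsager-type) correction.
        mh_new = sigmoid(b + mv @ W - (mh - 0.5) * (var_v @ W2))
        mh = damping * mh + (1.0 - damping) * mh_new
        var_h = mh - mh ** 2
        mv_new = sigmoid(a + W @ mh - (mv - 0.5) * (W2 @ var_h))
        mv = damping * mv + (1.0 - damping) * mv_new
    return mv, mh

def tap_free_energy(W, a, b, mv, mh):
    # Second-order TAP free energy evaluated at magnetizations (mv, mh).
    entropy = binary_entropy(mv) + binary_entropy(mh)
    var_v, var_h = mv - mv ** 2, mh - mh ** 2
    coupling = mv @ W @ mh + 0.5 * (var_v @ (W ** 2) @ var_h)
    return -entropy - a @ mv - b @ mh - coupling
```

Evaluated at a fixed point of the iteration, `tap_free_energy` serves as an approximation to the negative log partition function, which is what makes it usable as a deterministic training objective. For weak couplings it can be checked against brute-force enumeration on a very small model.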



Read also

The search for novel entangled phases of matter has led to the recent discovery of a new class of "entanglement transitions," exemplified by random tensor networks and monitored quantum circuits. Most known examples can be understood as some classical ordering transitions in an underlying statistical mechanics model, where entanglement maps onto the free energy cost of inserting a domain wall. In this paper, we study the possibility of entanglement transitions driven by physics beyond such statistical mechanics mappings. Motivated by recent applications of neural network-inspired variational Ansatze, we investigate under what conditions on the variational parameters these Ansatze can capture an entanglement transition. We study the entanglement scaling of short-range restricted Boltzmann machine (RBM) quantum states with random phases. For uncorrelated random phases, we analytically demonstrate the absence of an entanglement transition and reveal subtle finite-size effects in finite-size numerical simulations. Introducing phases with correlations decaying as $1/r^\alpha$ in real space, we observe three regions with a different scaling of entanglement entropy depending on the exponent $\alpha$. We study the nature of the transition between these regions, finding numerical evidence for critical behavior. Our work establishes the presence of long-range correlated phases in RBM-based wave functions as a required ingredient for entanglement transitions.
Clement Roussel 2021
Restricted Boltzmann Machines (RBM) are bi-layer neural networks used for the unsupervised learning of model distributions from data. The bipartite architecture of RBM naturally defines an elegant sampling procedure, called Alternating Gibbs Sampling (AGS), where the configurations of the latent-variable layer are sampled conditional on the data-variable layer, and vice versa. We study here the performance of AGS on several analytically tractable models borrowed from statistical mechanics. We show that standard AGS is not more efficient than classical Metropolis-Hastings (MH) sampling of the effective energy landscape defined on the data layer. However, RBM can identify meaningful representations of training data in their latent space. Furthermore, using these representations and combining Gibbs sampling with the MH algorithm in the latent space can enhance the sampling performance of the RBM when the hidden units encode weakly dependent features of the data. We illustrate our findings on three datasets: Bars and Stripes and MNIST, well known in machine learning, and the so-called Lattice Proteins, introduced in theoretical biology to study the sequence-to-structure mapping in proteins.
We propose a novel quantum model for the restricted Boltzmann machine (RBM), in which the visible units remain classical whereas the hidden units are quantized as noninteracting fermions. The free motion of the fermions is parametrically coupled to the classical signal of the visible units. This model possesses a quantum behaviour such as coherences between the hidden units. Numerical experiments show that this fact makes it more powerful than the classical RBM with the same number of hidden units. At the same time, a significant advantage of the proposed model over the other approaches to the Quantum Boltzmann Machine (QBM) is that it is exactly solvable and efficiently trainable on a classical computer: there is a closed expression for the log-likelihood gradient with respect to its parameters. This fact makes it interesting not only as a model of a hypothetical quantum simulator, but also as a quantum-inspired classical machine-learning algorithm.
Guido Montufar 2018
The restricted Boltzmann machine is a network of stochastic units with undirected interactions between pairs of visible and hidden units. This model was popularized as a building block of deep learning architectures and has continued to play an important role in applied and theoretical machine learning. Restricted Boltzmann machines carry a rich structure, with connections to geometry, applied algebra, probability, statistics, machine learning, and other areas. The analysis of these models is attractive in its own right and also as a platform to combine and generalize mathematical tools for graphical models with hidden variables. This article gives an introduction to the mathematical analysis of restricted Boltzmann machines, reviews recent results on the geometry of the sets of probability distributions representable by these models, and suggests a few directions for further investigation.
Restricted Boltzmann machines (RBMs) are energy-based neural networks which are commonly used as the building blocks for deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely-connected systems with weak interactions coming from spin-glass theory. While the TAP approach has been extensively studied for fully-visible binary spin systems, our construction is generalized to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support. In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and are able to show interesting features of unsupervised learning which could not be directly observed with sampling. Additionally, we demonstrate how to utilize our TAP-based framework for leveraging trained RBMs as joint priors in denoising problems.
