We consider restricted Boltzmann machines (RBMs) trained over an unstructured dataset made of blurred copies of definite but unavailable ``archetypes'' and we show that there exists a critical sample size beyond which the RBM can learn archetypes, namely the machine can successfully act as a generative model or as a classifier, according to the operational routine. In general, assessing a critical sample size (possibly in relation to the quality of the dataset) is still an open problem in machine learning. Here, restricting to the random theory, where shallow networks suffice and the grandmother-cell scenario is correct, we leverage the formal equivalence between RBMs and Hopfield networks to obtain a phase diagram for both neural architectures which highlights regions, in the space of the control parameters (i.e., number of archetypes, number of neurons, size and quality of the training set), where learning can be accomplished. Our investigations are led by analytical methods based on the statistical mechanics of disordered systems, and results are further corroborated by extensive Monte Carlo simulations.
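As a rough illustration of why sample size and dataset quality both matter, the sketch below generates blurred copies of a single archetype and checks how well an entry-wise majority vote recovers it. This is not the RBM or Hopfield analysis described above: the single archetype, the known grouping of examples, the majority-vote estimator, and the names `blurred_examples`, `majority_estimate`, `overlap` and `quality` are all assumptions made for illustration.

```python
import random

def blurred_examples(archetype, quality, n_examples, rng):
    """Return noisy copies of the archetype.

    Assumption: each entry keeps the archetype's sign with
    probability (1 + quality) / 2, so quality = 1 means no noise.
    """
    keep = (1.0 + quality) / 2.0
    return [[s if rng.random() < keep else -s for s in archetype]
            for _ in range(n_examples)]

def majority_estimate(examples):
    """Entry-wise sign of the sum over examples (ties resolved to +1)."""
    n = len(examples[0])
    return [1 if sum(ex[i] for ex in examples) >= 0 else -1 for i in range(n)]

def overlap(a, b):
    """Normalized overlap between two +/-1 vectors, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
N = 200  # number of entries (neurons), chosen arbitrarily
archetype = [rng.choice((-1, 1)) for _ in range(N)]

# One blurred copy versus many blurred copies at the same quality.
overlap_small = overlap(
    archetype, majority_estimate(blurred_examples(archetype, 0.2, 1, rng)))
overlap_large = overlap(
    archetype, majority_estimate(blurred_examples(archetype, 0.2, 201, rng)))
print(overlap_small, overlap_large)
```

With a low-quality dataset, a single example overlaps only weakly with the archetype, while the estimate from many examples is much closer to it. The actual critical sample size in the abstract depends on the full set of control parameters and is not reproduced by this toy.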
We propose a new framework to understand how quantum effects may impact on the dynamics of neural networks. We implement the dynamics of neural networks in terms of Markovian open quantum systems, which allows us to treat thermal and quantum coherent effects on the same footing. In particular, we propose an open quantum generalisation of the celebrated Hopfield neural network, the simplest toy model of associative memory. We determine its phase diagram and show that quantum fluctuations give rise to a qualitatively new non-equilibrium phase. This novel phase is characterised by limit cycles corresponding to high-dimensional stationary manifolds that may be regarded as a generalisation of storage patterns to the quantum domain.
Recent advances in deep learning and neural networks have led to an increased interest in the application of generative models in statistical and condensed matter physics. In particular, restricted Boltzmann machines (RBMs) and variational autoencoders (VAEs) as specific classes of neural networks have been successfully applied in the context of physical feature extraction and representation learning. Despite these successes, however, there is only limited understanding of their representational properties and limitations. To better understand the representational characteristics of RBMs and VAEs, we study their ability to capture physical features of the Ising model at different temperatures. This approach allows us to quantitatively assess learned representations by comparing sample features with corresponding theoretical predictions. Our results suggest that the considered RBMs and convolutional VAEs are able to capture the temperature dependence of magnetization, energy, and spin-spin correlations. The samples generated by RBMs are more evenly distributed across temperature than those generated by VAEs. We also find that convolutional layers in VAEs are important to model spin correlations whereas RBMs achieve similar or even better performances without convolutional filters.
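The sample features mentioned above (magnetization and energy) can be computed from a spin configuration roughly as follows. This is a minimal sketch, assuming a two-dimensional square lattice with periodic boundaries, nearest-neighbour coupling `coupling = 1.0` and no external field; none of these choices, nor the function names, are stated in the abstract.

```python
def magnetization(spins):
    """Mean spin value of a 2D configuration with entries +1 or -1."""
    n_sites = len(spins) * len(spins[0])
    return sum(sum(row) for row in spins) / n_sites

def energy_per_spin(spins, coupling=1.0):
    """Nearest-neighbour Ising energy per site, periodic boundaries.

    Each bond is counted once by pairing every site with its right
    and lower neighbours only.
    """
    rows, cols = len(spins), len(spins[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            right = spins[i][(j + 1) % cols]
            down = spins[(i + 1) % rows][j]
            total -= coupling * spins[i][j] * (right + down)
    return total / (rows * cols)

# Two reference configurations on a 4 x 4 lattice.
all_up = [[1] * 4 for _ in range(4)]
checkerboard = [[1 if (i + j) % 2 == 0 else -1 for j in range(4)]
                for i in range(4)]

print(magnetization(all_up), energy_per_spin(all_up))
print(magnetization(checkerboard), energy_per_spin(checkerboard))
```

Averaging these quantities over samples drawn from a generative model and comparing them with reference values at each temperature is one way to carry out the kind of quantitative comparison described above.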
The effects of a variable amount of random dilution of the synaptic couplings in Q-Ising multi-state neural networks with Hebbian learning are examined. A fraction of the couplings is explicitly allowed to be anti-Hebbian. Random dilution represents the dying or pruning of synapses and, hence, a static disruption of the learning process which can be considered as a form of multiplicative noise in the learning rule. Both parallel and sequential updating of the neurons can be treated. Symmetric dilution in the statics of the network is studied using the mean-field theory approach of statistical mechanics. General dilution, including asymmetric pruning of the couplings, is examined using the generating functional (path integral) approach of disordered systems. It is shown that random dilution acts as additive Gaussian noise in the Hebbian learning rule, with zero mean and a variance depending on the connectivity of the network and on the symmetry. Furthermore, a scaling factor appears that essentially measures the average amount of anti-Hebbian couplings.
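To make the idea of symmetric random dilution concrete, here is a minimal sketch that prunes Hebbian couplings with a symmetric random mask. It simplifies the setting considerably: it uses binary (+1/-1) patterns rather than Q-Ising multi-state neurons, omits anti-Hebbian couplings, and rescales surviving couplings by `1 / connectivity`, which is a common convention assumed here rather than taken from the abstract. The function names are hypothetical.

```python
import random

def hebbian_couplings(patterns):
    """Hebbian couplings J[i][j] = (1/N) * sum over patterns of xi_i * xi_j."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = sum(p[i] * p[j] for p in patterns) / n
    return J

def dilute_symmetric(J, connectivity, rng):
    """Keep each undirected bond with probability `connectivity`.

    Assumption: surviving couplings are divided by `connectivity`,
    so the diluted coupling equals the original one on average.
    """
    n = len(J)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < connectivity:
                D[i][j] = D[j][i] = J[i][j] / connectivity
    return D

rng = random.Random(1)
N, P, c = 60, 3, 0.5  # neurons, patterns, connectivity (arbitrary values)
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
J = hebbian_couplings(patterns)
D = dilute_symmetric(J, c, rng)

# Fraction of surviving bonds among the N * (N - 1) / 2 pairs.
kept = sum(1 for i in range(N) for j in range(i + 1, N) if D[i][j] != 0.0)
kept_fraction = kept / (N * (N - 1) / 2)
print(kept_fraction)
```

Under this rescaling, the difference between diluted and undiluted couplings averages to zero over the random mask, which is consistent with viewing dilution as zero-mean noise added to the Hebbian rule. Asymmetric dilution would draw the mask independently for `D[i][j]` and `D[j][i]`.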
The dynamics of neural networks is often characterized by collective behavior and quasi-synchronous events, where a large fraction of neurons fire in short time intervals, separated by uncorrelated firing activity. These global temporal signals are crucial for brain functioning. They strongly depend on the topology of the network and on the fluctuations of the connectivity. We propose a heterogeneous mean-field approach to neural dynamics on random networks that explicitly preserves the disorder in the topology at growing network sizes and leads to a set of self-consistent equations. Within this approach, we provide an effective description of microscopic and large-scale temporal signals in a leaky integrate-and-fire model with short-term plasticity, where quasi-synchronous events arise. Our equations provide a clear analytical picture of the dynamics, evidencing the contributions of both periodic (locked) and aperiodic (unlocked) neurons to the measurable average signal. In particular, we formulate and solve a global inverse problem of reconstructing the in-degree distribution from the knowledge of the average activity field. Our method is very general and applies to a large class of dynamical models on dense random networks.
The inclusion of a macroscopic adaptive threshold is studied for the retrieval dynamics of both layered feedforward and fully connected neural network models with synaptic noise. These two types of architecture require different methods to be solved numerically. In both cases it is shown that, if the threshold is chosen appropriately as a function of the cross-talk noise and of the activity of the stored patterns, adapting itself automatically in the course of the recall process, an autonomous functioning of the network is guaranteed. This self-control mechanism considerably improves the quality of retrieval, in particular the storage capacity, the basins of attraction and the mutual information content.