A Max-Sum algorithm for training discrete neural networks

240 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Alfredo Braunstein

تاريخ النشر 2015

مجال البحث فيزياء الهندسة المعلوماتية

والبحث باللغة English

تأليف Carlo Baldassi - Alfredo Braunstein

الأنظمة المضطربة والشبكات العصبية التعلم الآلي الحوسبة العصبية والتطورية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present an efficient learning algorithm for the problem of training neural networks with discrete synapses, a well-known hard (NP-complete) discrete optimization problem. The algorithm is a variant of the so-called Max-Sum (MS) algorithm. In particular, we show how, for bounded integer weights with $q$ distinct states and independent concave a priori distribution (e.g. $l_{1}$ regularization), the algorithms time complexity can be made to scale as $Oleft(Nlog Nright)$ per node update, thus putting it on par with alternative schemes, such as Belief Propagation (BP), without resorting to approximations. Two special cases are of particular interest: binary synapses $Win{-1,1}$ and ternary synapses $Win{-1,0,1}$ with $l_{0}$ regularization. The algorithm we present performs as well as BP on binary perceptron learning problems, and may be better suited to address the problem on fully-connected two-layer networks, since inherent symmetries in two layer networks are naturally broken using the MS approach.

قيم البحث

293 - Carlo Baldassi , Federica Gerace , Hilbert J. Kappen 2017

Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives p rominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension aimed at training discrete deep neural networks is also investigated.

الأنظمة المضطربة والشبكات العصبية التعلم الآلي الحوسبة العصبية والتطورية

Biologically Plausible Sequence Learning with Spiking Neural Networks

113 - Zuozhu Liu , Thiparat Chotibut , Christopher Hillar 2019

Motivated by the celebrated discrete-time model of nervous activity outlined by McCulloch and Pitts in 1943, we propose a novel continuous-time model, the McCulloch-Pitts network (MPN), for sequence learning in spiking neural networks. Our model has a local learning rule, such that the synaptic weight updates depend only on the information directly accessible by the synapse. By exploiting asymmetry in the connections between binary neurons, we show that MPN can be trained to robustly memorize multiple spatiotemporal patterns of binary vectors, generalizing the ability of the symmetric Hopfield network to memorize static spatial patterns. In addition, we demonstrate that the model can efficiently learn sequences of binary pictures as well as generative models for experimental neural spike-train data. Our learning rule is consistent with spike-timing-dependent plasticity (STDP), thus providing a theoretical ground for the systematic design of biologically inspired networks with large and robust long-range sequence storage capacity.

الأنظمة المضطربة والشبكات العصبية التعلم الآلي الحوسبة العصبية والتطورية

A Supervised STDP-based Training Algorithm for Living Neural Networks

148 - Yuan Zeng , Kevin Devincentis , Yao Xiao 2017

Neural networks have shown great potential in many applications like speech recognition, drug discovery, image classification, and object detection. Neural network models are inspired by biological neural networks, but they are optimized to perform m achine learning tasks on digital computers. The proposed work explores the possibilities of using living neural networks in vitro as basic computational elements for machine learning applications. A new supervised STDP-based learning algorithm is proposed in this work, which considers neuron engineering constrains. A 74.7% accuracy is achieved on the MNIST benchmark for handwritten digit recognition.

الحوسبة العصبية والتطورية التعلم الآلي الأساليب الكمية

Genetic-algorithm-optimized neural networks for gravitational wave classification

386 - Dwyer S. Deighan , Scott E. Field , Collin D. Capano 2020

Gravitational-wave detection strategies are based on a signal analysis technique known as matched filtering. Despite the success of matched filtering, due to its computational cost, there has been recent interest in developing deep convolutional neur al networks (CNNs) for signal detection. Designing these networks remains a challenge as most procedures adopt a trial and error strategy to set the hyperparameter values. We propose a new method for hyperparameter optimization based on genetic algorithms (GAs). We compare six different GA variants and explore different choices for the GA-optimized fitness score. We show that the GA can discover high-quality architectures when the initial hyperparameter seed values are far from a good solution as well as refining already good networks. For example, when starting from the architecture proposed by George and Huerta, the network optimized over the 20-dimensional hyperparameter space has 78% fewer trainable parameters while obtaining an 11% increase in accuracy for our test problem. Using genetic algorithm optimization to refine an existing network should be especially useful if the problem context (e.g. statistical properties of the noise, signal model, etc) changes and one needs to rebuild a network. In all of our experiments, we find the GA discovers significantly less complicated networks as compared to the seed network, suggesting it can be used to prune wasteful network structures. While we have restricted our attention to CNN classifiers, our GA hyperparameter optimization strategy can be applied within other machine learning settings.

النسبية العامة وهدية الكونيات الكم التعلم الآلي الحوسبة العصبية والتطورية

Training Restricted Boltzmann Machines via the Thouless-Anderson-Palmer Free Energy

742 - Marylou Gabrie , Eric W. Tramel , Florent Krzakala 2015

Restricted Boltzmann machines are undirected neural networks which have been shown to be effective in many applications, including serving as initializations for training deep multi-layer neural networks. One of the main reasons for their success is the existence of efficient and practical stochastic algorithms, such as contrastive divergence, for unsupervised training. We propose an alternative deterministic iterative procedure based on an improved mean field method from statistical physics known as the Thouless-Anderson-Palmer approach. We demonstrate that our algorithm provides performance equal to, and sometimes superior to, persistent contrastive divergence, while also providing a clear and easy to evaluate objective function. We believe that this strategy can be easily generalized to other models as well as to more accurate higher-order approximations, paving the way for systematic improvements in training Boltzmann machines with hidden units.

الأنظمة المضطربة والشبكات العصبية التعلم الآلي الحوسبة العصبية والتطورية