Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science

99 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Decebal Constantin Mocanu

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Decebal Constantin Mocanu - Elena Mocanu - Peter Stone

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Through the success of deep learning in various domains, artificial neural networks are currently among the most used artificial intelligence methods. Taking inspiration from the network properties of biological neural networks (e.g. sparsity, scale-freeness), we argue that (contrary to general practice) artificial neural networks, too, should not have fully-connected layers. Here we propose sparse evolutionary training of artificial neural networks, an algorithm which evolves an initial sparse topology (ErdH{o}s-Renyi random graph) of two consecutive layers of neurons into a scale-free topology, during learning. Our method replaces artificial neural networks fully-connected layers with sparse ones before training, reducing quadratically the number of parameters, with no decrease in accuracy. We demonstrate our claims on restricted Boltzmann machines, multi-layer perceptrons, and convolutional neural networks for unsupervised and supervised learning on 15 datasets. Our approach has the potential to enable artificial neural networks to scale up beyond what is currently possible.

قيم البحث

193 - Leendert A Remmelzwaal , George F R Ellis , Jonathan Tapson 2019

In this paper we introduce a novel Salience Affected Artificial Neural Network (SANN) that models the way neuromodulators such as dopamine and noradrenaline affect neural dynamics in the human brain by being distributed diffusely through neocortical regions, allowing both salience signals to modulate cognition immediately, and one time learning to take place through strengthening entire patterns of activation at one go. We present a model that is capable of one-time salience tagging in a neural network trained to classify objects, and returns a salience response during classification (inference). We explore the effects of salience on learning via its effect on the activation functions of each node, as well as on the strength of weights between nodes in the network. We demonstrate that salience tagging can improve classification confidence for both the individual image as well as the class of images it belongs to. We also show that the computation impact of producing a salience response is minimal. This research serves as a proof of concept, and could be the first step towards introducing salience tagging into Deep Learning Networks and robotics.

الحوسبة العصبية والتطورية الخلايا العصبية والإدراك

On the Binding Problem in Artificial Neural Networks

177 - Klaus Greff , Sjoerd van Steenkiste , Jurgen Schmidhuber 2020

Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences. In this paper, we argue that the underlying cause for this shortcoming is their inability to dynamically and flexibly bind i nformation that is distributed throughout the network. This binding problem affects their capacity to acquire a compositional understanding of the world in terms of symbol-like entities (like objects), which is crucial for generalizing in predictable and systematic ways. To address this issue, we propose a unifying framework that revolves around forming meaningful entities from unstructured sensory inputs (segregation), maintaining this separation of information at a representational level (representation), and using these entities to construct new inferences, predictions, and behaviors (composition). Our analysis draws inspiration from a wealth of research in neuroscience and cognitive psychology, and surveys relevant mechanisms from the machine learning literature, to help identify a combination of inductive biases that allow symbolic information processing to emerge naturally in neural networks. We believe that a compositional approach to AI, in terms of grounded symbol-like representations, is of fundamental importance for realizing human-level generalization, and we hope that this paper may contribute towards that goal as a reference and inspiration.

الحوسبة العصبية والتطورية الذكاء الاصطناعي التعلم الآلي

Adaptive Reinforcement Learning through Evolving Self-Modifying Neural Networks

189 - Samuel Schmidgall 2020

The adaptive learning capabilities seen in biological neural networks are largely a product of the self-modifying behavior emerging from online plastic changes in synaptic connectivity. Current methods in Reinforcement Learning (RL) only adjust to ne w interactions after reflection over a specified time interval, preventing the emergence of online adaptivity. Recent work addressing this by endowing artificial neural networks with neuromodulated plasticity have been shown to improve performance on simple RL tasks trained using backpropagation, but have yet to scale up to larger problems. Here we study the problem of meta-learning in a challenging quadruped domain, where each leg of the quadruped has a chance of becoming unusable, requiring the agent to adapt by continuing locomotion with the remaining limbs. Results demonstrate that agents evolved using self-modifying plastic networks are more capable of adapting to complex meta-learning learning tasks, even outperforming the same network updated using gradient-based algorithms while taking less time to train.

الحوسبة العصبية والتطورية الذكاء الاصطناعي التعلم الآلي

BSNN: Towards Faster and Better Conversion of Artificial Neural Networks to Spiking Neural Networks with Bistable Neurons

79 - Yang Li , Yi Zeng , Dongcheng Zhao 2021

The spiking neural network (SNN) computes and communicates information through discrete binary events. It is considered more biologically plausible and more energy-efficient than artificial neural networks (ANN) in emerging neuromorphic hardware. How ever, due to the discontinuous and non-differentiable characteristics, training SNN is a relatively challenging task. Recent work has achieved essential progress on an excellent performance by converting ANN to SNN. Due to the difference in information processing, the converted deep SNN usually suffers serious performance loss and large time delay. In this paper, we analyze the reasons for the performance loss and propose a novel bistable spiking neural network (BSNN) that addresses the problem of spikes of inactivated neurons (SIN) caused by the phase lead and phase lag. Also, when ResNet structure-based ANNs are converted, the information of output neurons is incomplete due to the rapid transmission of the shortcut path. We design synchronous neurons (SN) to help efficiently improve performance. Experimental results show that the proposed method only needs 1/4-1/10 of the time steps compared to previous work to achieve nearly lossless conversion. We demonstrate state-of-the-art ANN-SNN conversion for VGG16, ResNet20, and ResNet34 on challenging datasets including CIFAR-10 (95.16% top-1), CIFAR-100 (78.12% top-1), and ImageNet (72.64% top-1).

الحوسبة العصبية والتطورية الذكاء الاصطناعي

CTNN: Corticothalamic-inspired neural network

98 - Leendert A Remmelzwaal , Amit K Mishra , George F R Ellis 2019

Sensory predictions by the brain in all modalities take place as a result of bottom-up and top-down connections both in the neocortex and between the neocortex and the thalamus. The bottom-up connections in the cortex are responsible for learning, pa ttern recognition, and object classification, and have been widely modelled using artificial neural networks (ANNs). Here, we present a neural network architecture modelled on the top-down corticothalamic connections and the behaviour of the thalamus: a corticothalamic neural network (CTNN), consisting of an auto-encoder connected to a difference engine with a threshold. We demonstrate that the CTNN is input agnostic, multi-modal, robust during partial occlusion of one or more sensory inputs, and has significantly higher processing efficiency than other predictive coding models, proportional to the number of sequentially similar inputs in a sequence. This increased efficiency could be highly significant in more complex implementations of this architecture, where the predictive nature of the cortex will allow most of the incoming data to be discarded.

الحوسبة العصبية والتطورية الخلايا العصبية والإدراك