Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

Added by Agnieszka Maria Słowik

Publication date 2020

Fields: Informatics Engineering, Mathematical Statistics

Research language: English

Authors Alex Lamb - Anirudh Goyal - Agnieszka Słowik

Machine Learning Machine Learning

Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer. A downside to this approach is that each layer (or module, as multiple modules can operate in parallel) is tasked with processing the entire hidden state, rather than a particular part of the state which is most relevant for that module. Methods which only operate on a small number of input variables are an essential part of most programming languages, and they allow for improved modularity and code re-usability. Our proposed method, Neural Function Modules (NFM), aims to introduce the same structural capability into deep learning. Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems. The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm which, as we show, improves the results in standard classification, out-of-domain generalization, generative modeling, and learning representations in the context of reinforcement learning.
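As a rough illustration of the idea in the abstract (a module reading only the most relevant parts of the state, across layers), the following is a minimal sketch of top-k sparse attention over a set of layer outputs. It is not the authors' implementation: the function name `sparse_read`, the scaled dot-product scoring, the top-k selection rule, and the value of `k` are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def sparse_read(query, layer_states, k=2):
    """Attend over candidate layer states, keeping only the k best matches.

    query:        vector of shape (d,), produced by the reading module (assumed)
    layer_states: list of vectors of shape (d,), one per layer it may read from
    """
    states = np.stack(layer_states)                      # (num_layers, d)
    scores = states @ query / np.sqrt(states.shape[1])   # scaled dot product
    keep = np.argsort(scores)[-k:]                       # indices of the k largest scores
    masked = np.full_like(scores, -np.inf)
    masked[keep] = scores[keep]                          # all other layers are ignored
    weights = softmax(masked)
    return weights @ states, weights

# Hypothetical usage: five layer outputs of width 4, read sparsely by one module.
rng = np.random.default_rng(0)
states = [rng.normal(size=4) for _ in range(5)]
query = rng.normal(size=4)
out, w = sparse_read(query, states, k=2)
```

In this sketch the module's input `out` is a weighted mix of only two layer outputs, so which layers it reads from can change with the query rather than being fixed by the architecture.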

Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers

Junjie Liu, Zhe Xu, Runbin Shi (2020)

We present a novel network pruning algorithm called Dynamic Sparse Training that can jointly find the optimal network parameters and sparse network structure in a unified optimization process with trainable pruning thresholds. These thresholds can be adjusted dynamically, layer by layer and at a fine granularity, via backpropagation. We demonstrate that our dynamic sparse training algorithm can easily train very sparse neural network models with little performance loss using the same number of training epochs as dense models. Dynamic Sparse Training achieves state-of-the-art performance compared with other sparse training algorithms on various network architectures. Additionally, we have several surprising observations that provide strong evidence for the effectiveness and efficiency of our algorithm. These observations reveal the underlying problems of traditional three-stage pruning algorithms and present the potential guidance provided by our algorithm to the design of more compact network architectures.
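A minimal sketch of the masking step that threshold-based pruning implies is given below: weights whose magnitude falls below a threshold are zeroed in the forward pass. The per-row threshold layout, the function name `masked_weights`, and the example values are assumptions; the sketch covers only the forward mask and leaves out how gradients reach the thresholds, which the abstract does not describe.

```python
import numpy as np

def masked_weights(W, thresholds):
    """Zero out weights whose magnitude does not exceed their row's threshold.

    W:          weight matrix of shape (n_out, n_in)
    thresholds: one threshold per output row, shape (n_out,) (assumed layout)
    """
    mask = (np.abs(W) > thresholds[:, None]).astype(W.dtype)
    return W * mask, mask

# Hypothetical example values.
W = np.array([[0.5, -0.1, 0.05],
              [-0.3, 0.2, 0.9]])
t = np.array([0.2, 0.25])
W_sparse, mask = masked_weights(W, t)
sparsity = 1.0 - mask.mean()   # fraction of weights removed
```

Raising a row's threshold prunes more of that row, which is why making the thresholds trainable lets the sparsity pattern change during training.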

Machine Learning Machine Learning

GAIT: A Geometric Approach to Information Theory

Jose Gallego, Ankit Vani, Max Schwarzer (2019)

We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities between them. This concept was originally introduced in theoretical ecology to study the diversity of ecosystems. Based on this notion of entropy, we introduce geometry-aware counterparts for several concepts and theorems in information theory. Notably, our proposed divergence exhibits performance on par with state-of-the-art methods based on the Wasserstein distance, but enjoys a closed-form expression that can be computed efficiently. We demonstrate the versatility of our method via experiments on a broad range of domains: training generative models, computing image barycenters, approximating empirical measures and counting modes.
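The abstract does not state the entropy formula. As a sketch, the code below assumes the order-1 similarity-sensitive entropy from the ecology literature, H(p) = -sum_i p_i log (Kp)_i, where K is a symmetric similarity matrix with ones on the diagonal. The function name and the example matrices are illustrative only.

```python
import numpy as np

def similarity_entropy(p, K):
    """Order-1 similarity-sensitive entropy: -sum_i p_i * log((K p)_i).

    p: probability vector over symbols
    K: similarity matrix with K[i, i] = 1 and entries in [0, 1] (assumed form)
    """
    p = np.asarray(p, dtype=float)
    Kp = np.asarray(K, dtype=float) @ p
    nz = p > 0                      # symbols with zero mass contribute nothing
    return -np.sum(p[nz] * np.log(Kp[nz]))

p = np.full(4, 0.25)
h_distinct = similarity_entropy(p, np.eye(4))         # symbols fully dissimilar
h_identical = similarity_entropy(p, np.ones((4, 4)))  # symbols fully similar
```

With the identity matrix the quantity reduces to Shannon entropy, and with an all-ones matrix it collapses to zero, which matches the intuition that four indistinguishable symbols carry no diversity.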

Machine Learning Machine Learning

Learning Hierarchical Information Flow with Recurrent Neural Modules

Danijar Hafner, Alex Irpan, James Davidson (2017)

We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus. Our model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. We show that our model learns to route information hierarchically, processing input data by a chain of modules. We observe common architectures, such as feed forward neural networks and skip connections, emerging as special cases of our architecture, while novel connectivity patterns are learned for the text8 compression task. Our model outperforms standard recurrent neural networks on several sequential benchmarks.

Machine Learning Artificial Intelligence

Integrating Temporal Information to Spatial Information in a Neural Circuit

Nancy Lynch, Mien Brabeeba Wang (2019)

In this paper, we consider networks of deterministic spiking neurons, firing synchronously at discrete times; such spiking neural networks are inspired by networks of neurons and synapses that occur in brains. We consider the problem of translating temporal information into spatial information in such networks, an important task that is carried out by actual brains. Specifically, we define two problems: First Consecutive Spikes Counting (FCSC) and Total Spikes Counting (TSC), which model spike and rate coding aspects of translating temporal information into spatial information respectively. Assuming an upper bound of $T$ on the length of the temporal input signal, we design two networks that solve these two problems, each using $O(\log T)$ neurons and terminating in time $1$. We also prove that there is no network with fewer than $T$ neurons that solves either question in time $0$.

Distributed Parallel and Cluster Computing Neural and Evolutionary Computing

Improving Neural Network with Uniform Sparse Connectivity

Weijun Luo (2020)

Neural networks form the foundation of deep learning and numerous AI applications. Classical neural networks are fully connected, expensive to train, and prone to overfitting. Sparse networks tend to have convoluted structure search, suboptimal performance, and limited usage. We propose the novel uniform sparse network (USN) with even and sparse connectivity within each layer. USN has one striking property: its performance is independent of the substantial topology variation and enormous model space, so it offers a search-free solution to all of the above-mentioned issues of neural networks. USN consistently and substantially outperforms the state-of-the-art sparse network models in prediction accuracy, speed, and robustness. It even achieves higher prediction accuracy than the fully connected network with only 0.55% of the parameters and 1/4 of the computing time and resources. Importantly, USN is conceptually simple as a natural generalization of the fully connected network, with multiple improvements in accuracy, robustness, and scalability. USN can replace the latter in a range of applications, data types, and deep learning architectures. We have made USN open source at https://github.com/datapplab/sparsenet.

Machine Learning Machine Learning
