Do you want to publish a course? Click here

Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

123   0   0.0 ( 0 )
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer. A downside to this approach is that each layer (or module, as multiple modules can operate in parallel) is tasked with processing the entire hidden state, rather than a particular part of the state which is most relevant for that module. Methods which only operate on a small number of input variables are an essential part of most programming languages, and they allow for improved modularity and code re-usability. Our proposed method, Neural Function Modules (NFM), aims to introduce the same structural capability into deep learning. Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems. The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm which, as we show, improves the results in standard classification, out-of-domain generalization, generative modeling, and learning representations in the context of reinforcement learning.



rate research

Read More

66 - Junjie Liu , Zhe Xu , Runbin Shi 2020
We present a novel network pruning algorithm called Dynamic Sparse Training that can jointly find the optimal network parameters and sparse network structure in a unified optimization process with trainable pruning thresholds. These thresholds can have fine-grained layer-wise adjustments dynamically via backpropagation. We demonstrate that our dynamic sparse training algorithm can easily train very sparse neural network models with little performance loss using the same number of training epochs as dense models. Dynamic Sparse Training achieves the state of the art performance compared with other sparse training algorithms on various network architectures. Additionally, we have several surprising observations that provide strong evidence for the effectiveness and efficiency of our algorithm. These observations reveal the underlying problems of traditional three-stage pruning algorithms and present the potential guidance provided by our algorithm to the design of more compact network architectures.
We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities between them. This concept was originally introduced in theoretical ecology to study the diversity of ecosystems. Based on this notion of entropy, we introduce geometry-aware counterparts for several concepts and theorems in information theory. Notably, our proposed divergence exhibits performance on par with state-of-the-art methods based on the Wasserstein distance, but enjoys a closed-form expression that can be computed efficiently. We demonstrate the versatility of our method via experiments on a broad range of domains: training generative models, computing image barycenters, approximating empirical measures and counting modes.
We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus. Our model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. We show that our model learns to route information hierarchically, processing input data by a chain of modules. We observe common architectures, such as feed forward neural networks and skip connections, emerging as special cases of our architecture, while novel connectivity patterns are learned for the text8 compression task. Our model outperforms standard recurrent neural networks on several sequential benchmarks.
In this paper, we consider networks of deterministic spiking neurons, firing synchronously at discrete times; such spiking neural networks are inspired by networks of neurons and synapses that occur in brains. We consider the problem of translating temporal information into spatial information in such networks, an important task that is carried out by actual brains. Specifically, we define two problems: First Consecutive Spikes Counting (FCSC) and Total Spikes Counting (TSC), which model spike and rate coding aspects of translating temporal information into spatial information respectively. Assuming an upper bound of $T$ on the length of the temporal input signal, we design two networks that solve these two problems, each using $O(log T)$ neurons and terminating in time $1$. We also prove that there is no network with less than $T$ neurons that solves either question in time $0$.
126 - Weijun Luo 2020
Neural network forms the foundation of deep learning and numerous AI applications. Classical neural networks are fully connected, expensive to train and prone to overfitting. Sparse networks tend to have convoluted structure search, suboptimal performance and limited usage. We proposed the novel uniform sparse network (USN) with even and sparse connectivity within each layer. USN has one striking property that its performance is independent of the substantial topology variation and enormous model space, thus offers a search-free solution to all above mentioned issues of neural networks. USN consistently and substantially outperforms the state-of-the-art sparse network models in prediction accuracy, speed and robustness. It even achieves higher prediction accuracy than the fully connected network with only 0.55% parameters and 1/4 computing time and resources. Importantly, USN is conceptually simple as a natural generalization of fully connected network with multiple improvements in accuracy, robustness and scalability. USN can replace the latter in a range of applications, data types and deep learning architectures. We have made USN open source at https://github.com/datapplab/sparsenet.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا