
EleAtt-RNN: Adding Attentiveness to Neurons in Recurrent Neural Networks

Posted by Pengfei Zhang
Publication date: 2019
Research field: Informatics engineering
Paper language: English





Recurrent neural networks (RNNs) are capable of modeling temporal dependencies of complex sequential data. In general, currently available RNN structures tend to concentrate on controlling the contributions of current and previous information, while the different importance levels of different elements within an input vector are usually ignored. We propose a simple yet effective Element-wise-Attention Gate (EleAttG), which can easily be added to an RNN block (e.g., all RNN neurons in an RNN layer) to empower the RNN neurons with an attentiveness capability. For an RNN block, an EleAttG adaptively modulates the input by assigning a different level of importance, i.e., attention, to each element/dimension of the input. We refer to an RNN block equipped with an EleAttG as an EleAtt-RNN block. Instead of modulating the input as a whole, the EleAttG modulates the input at fine granularity, i.e., element-wise, and the modulation is content adaptive. The proposed EleAttG, as an additional fundamental unit, is general and can be applied to any RNN structure, e.g., the standard RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). We demonstrate the effectiveness of the proposed EleAtt-RNN by applying it to different tasks, including action recognition from both skeleton-based data and RGB videos, gesture recognition, and sequential MNIST classification. Experiments show that adding attentiveness through EleAttGs to RNN blocks significantly improves the power of RNNs.
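To make the mechanism concrete, below is a minimal PyTorch sketch of how an element-wise attention gate of this kind could be attached to a GRU cell. The class name EleAttGRUCell, the gate parameterization from the concatenated input and hidden state, and all dimensions are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of an element-wise attention gate attached to a GRU cell.
# EleAttGRUCell and all sizes are illustrative, not the paper's exact code.
import torch
import torch.nn as nn

class EleAttGRUCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Gate mapping [x_t, h_{t-1}] to one attention weight per input element.
        self.att = nn.Linear(input_size + hidden_size, input_size)
        self.gru = nn.GRUCell(input_size, hidden_size)

    def forward(self, x, h):
        # a_t in (0, 1)^input_size: per-element, content-adaptive importance.
        a = torch.sigmoid(self.att(torch.cat([x, h], dim=-1)))
        # Modulate the input element-wise before the standard GRU update.
        return self.gru(a * x, h)

cell = EleAttGRUCell(input_size=64, hidden_size=128)
x = torch.randn(8, 64)    # batch of 8 input vectors at one time step
h = torch.zeros(8, 128)
h = cell(x, h)
```

Because the gate sits in front of the recurrent update, the same construction can wrap a standard RNN or LSTM cell unchanged, which is what makes the unit general.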




Read also

Recurrent neural networks (RNNs) are capable of modeling the temporal dynamics of complex sequential information. However, the structures of existing RNN neurons mainly focus on controlling the contributions of current and historical information but do not explore the different importance levels of different elements in an input vector of a time slot. We propose adding a simple yet effective Element-wise-Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer) to empower the RNN neurons with the attentiveness capability. For an RNN block, an EleAttG is added to adaptively modulate the input by assigning a different level of importance, i.e., attention, to each element/dimension of the input. We refer to an RNN block equipped with an EleAttG as an EleAtt-RNN block. Specifically, the modulation of the input is content adaptive and is performed at fine granularity, being element-wise rather than input-wise. The proposed EleAttG, as an additional fundamental unit, is general and can be applied to any RNN structure, e.g., the standard RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). We demonstrate the effectiveness of the proposed EleAtt-RNN by applying it to action recognition tasks on both 3D human skeleton data and RGB videos. Experiments show that adding attentiveness through EleAttGs to RNN blocks significantly boosts the power of RNNs.
With the rapid development of natural language processing technologies, more and more text steganographic methods based on automatic text generation have appeared in recent years. These models use the powerful self-learning and feature extraction abilities of neural networks to learn the feature expression of massive normal texts, and can then automatically generate steganographic texts that conform to this statistical distribution based on the learned patterns. In this paper, we observe that the conditional probability distribution of each word in automatically generated steganographic texts is distorted once hidden information is embedded. We use Recurrent Neural Networks (RNNs) to extract these feature distribution differences and then classify texts into cover-text and stego-text categories. Experimental results show that the proposed model achieves high detection accuracy. Moreover, the proposed model can even exploit the subtle differences in the feature distribution of texts to estimate the amount of hidden information embedded in a generated steganographic text.
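As a rough illustration of the detection side, the sketch below shows an RNN classifier that reads a token sequence and outputs cover-vs-stego logits. The class name StegoDetector, the LSTM backbone, and all sizes are assumptions for illustration, not the paper's exact architecture.

```python
# A hedged sketch: an RNN reads token sequences and classifies them as
# cover (0) or stego (1). Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class StegoDetector(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_size, batch_first=True)
        self.classify = nn.Linear(hidden_size, 2)  # cover vs. stego logits

    def forward(self, tokens):                 # tokens: (batch, seq_len) int64
        _, (h, _) = self.rnn(self.embed(tokens))
        # The final hidden state summarizes distributional features of the text.
        return self.classify(h[-1])            # logits: (batch, 2)

model = StegoDetector(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 50)))  # 4 sequences of 50 tokens
```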
Solving the visual symbol grounding problem has long been a goal of artificial intelligence. The field appears to be advancing closer to this goal with recent breakthroughs in deep learning for natural language grounding in static images. In this paper, we propose to translate videos directly to sentences using a unified deep neural network with both convolutional and recurrent structure. Described video datasets are scarce, and most existing methods have been applied to toy domains with a small vocabulary of possible words. By transferring knowledge from 1.2M+ images with category labels and 100,000+ images with captions, our method is able to create sentence descriptions of open-domain videos with large vocabularies. We compare our approach with recent work using language generation metrics; subject, verb, and object prediction accuracy; and a human evaluation.
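A hedged sketch of the pipeline this abstract describes: per-frame features from a pretrained image CNN are pooled into a video code that conditions an LSTM language model, which emits the sentence word by word. All module names and dimensions here are assumptions for illustration only.

```python
# Illustrative CNN-to-RNN captioning sketch; names/sizes are assumptions.
import torch
import torch.nn as nn

class VideoCaptioner(nn.Module):
    def __init__(self, feat_dim=4096, hidden_size=512, vocab_size=10000):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden_size)  # video code -> initial state
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, frame_feats, words):
        # frame_feats: (batch, frames, feat_dim) from a pretrained image CNN.
        video = torch.tanh(self.proj(frame_feats.mean(dim=1)))  # pooled video code
        h0 = video.unsqueeze(0)                                 # (1, batch, hidden)
        c0 = torch.zeros_like(h0)
        hidden, _ = self.lstm(self.embed(words), (h0, c0))
        return self.out(hidden)     # next-word logits at each position

model = VideoCaptioner()
logits = model(torch.randn(2, 30, 4096), torch.randint(0, 10000, (2, 12)))
```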
Magnetic Resonance Fingerprinting (MRF) is a new approach to quantitative magnetic resonance imaging that allows simultaneous measurement of multiple tissue properties in a single, time-efficient acquisition. Standard MRF reconstructs parametric maps using dictionary matching and lacks scalability due to computational inefficiency. We propose to perform MRF map reconstruction using a recurrent neural network, which exploits the time-dependent information of the MRF signal evolution. We evaluate our method on multiparametric synthetic signals and compare it to existing MRF map reconstruction approaches, including those based on neural networks. Our method achieves state-of-the-art estimates of T1 and T2 values. In addition, the reconstruction time is significantly reduced compared to dictionary-matching based approaches.
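The following is a minimal sketch, under assumed input shapes, of the general idea: a recurrent network reads each voxel's MRF signal evolution over time and regresses T1 and T2 directly, avoiding the dictionary search. The GRU backbone, the class name MRFRegressor, and all sizes are illustrative, not the paper's exact model.

```python
# Sketch of RNN-based MRF map reconstruction; names/sizes are assumptions.
import torch
import torch.nn as nn

class MRFRegressor(nn.Module):
    def __init__(self, hidden_size=64):
        super().__init__()
        # Reads the fingerprint one timepoint at a time (scalar input per step).
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)   # outputs: (T1, T2) estimates

    def forward(self, signal):                  # signal: (voxels, timepoints)
        _, h = self.rnn(signal.unsqueeze(-1))   # exploit temporal structure
        return self.head(h[-1])                 # (voxels, 2)

est = MRFRegressor()(torch.randn(32, 1000))    # 32 voxels, 1000-point fingerprints
```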
Till S. Hartmann, 2018
Classical convolutional neural networks (cCNNs) are very good at categorizing objects in images. But, unlike human vision, which is relatively robust to noise in images, the performance of cCNNs declines quickly as image quality worsens. Here we propose to use recurrent connections within the convolutional layers to make networks robust against pixel noise, such as could arise from imaging at low light levels, and thereby significantly increase their performance when tested with simulated noisy video sequences. We show that cCNNs classify images with high signal-to-noise ratios (SNRs) well, but are easily outperformed on low-SNR images (high noise levels) by convolutional neural networks that have recurrency added to convolutional layers, henceforth referred to as gruCNNs. Adding Bayes-optimal temporal integration to allow the cCNN to integrate multiple image frames still does not match gruCNN performance. Additionally, we show that at low SNRs, the probabilities predicted by the gruCNN (after calibration) have higher confidence than those predicted by the cCNN. We propose to consider recurrent connections in the early stages of neural networks as a solution to computer vision under imperfect lighting conditions and noisy environments, challenges faced during real-time video streams of autonomous driving at night, during rain or snow, and other non-ideal situations.
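The sketch below shows one plausible form of such a recurrent convolutional layer: a convolutional GRU cell whose gates are themselves convolutions, so the hidden state integrates evidence across video frames. The class name ConvGRUCell and all sizes are assumptions; the paper's exact gruCNN formulation may differ.

```python
# A hedged sketch of a "gruCNN"-style layer: a convolutional GRU cell.
# ConvGRUCell and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.zr = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # update/reset gates
        self.hc = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)      # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.zr(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_cand = torch.tanh(self.hc(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_cand   # temporal integration across frames

cell = ConvGRUCell(in_ch=3, hid_ch=16)
h = torch.zeros(1, 16, 32, 32)
for frame in torch.randn(5, 1, 3, 32, 32):   # 5 noisy frames of a video clip
    h = cell(frame, h)                       # state accumulates evidence over time
```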
