A modularity comparison of Long Short-Term Memory and Morphognosis neural networks

86 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Tom Portegys PhD

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Thomas E. Portegys

التعلم الآلي الحوسبة العصبية والتطورية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This study compares the modularity performance of two artificial neural network architectures: a Long Short-Term Memory (LSTM) recurrent network, and Morphognosis, a neural network based on a hierarchy of spatial and temporal contexts. Mazes are used to measure performance, defined as the ability to utilize independently learned mazes to solve mazes composed of them. A maze is a sequence of rooms connected by doors. The modular task is implemented as follows: at the beginning of the maze, an initial door choice forms a context that must be retained until the end of an intervening maze, where the same door must be chosen again to reach the goal. For testing, the door-association mazes and separately trained intervening mazes are presented together for the first time. While both neural networks perform well during training, the testing performance of Morphognosis is significantly better than LSTM on this modular task.

قيم البحث

87 - Xiangang Li , Xihong Wu 2016

Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-art performance on many speech recognition tasks, as they are able to provide the learned dynamically changing contextual window of all sequence histo ry. On the other hand, the convolutional neural networks (CNNs) have brought significant improvements to deep feed-forward neural networks (FFNNs), as they are able to better reduce spectral variation in the input signal. In this paper, a network architecture called as convolutional recurrent neural network (CRNN) is proposed by combining the CNN and LSTM RNN. In the proposed CRNNs, each speech frame, without adjacent context frames, is organized as a number of local feature patches along the frequency axis, and then a LSTM network is performed on each feature patch along the time axis. We train and compare FFNNs, LSTM RNNs and the proposed LSTM CRNNs at various number of configurations. Experimental results show that the LSTM CRNNs can exceed state-of-the-art speech recognition performance.

الحساب واللغة الحوسبة العصبية والتطورية

On the Long-Term Memory of Deep Recurrent Networks

174 - Yoav Levine , Or Sharir , Alon Ziv 2017

A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks which involve sequential data, is their ability to model intricate long-term temporal dependencies. However, a well established measure of RNNs long-term memory capacity is lacking, and thus formal understanding of the effect of depth on their ability to correlate data throughout time is limited. Specifically, existing depth efficiency results on convolutional networks do not suffice in order to account for the success of deep RNNs on data of varying lengths. In order to address this, we introduce a measure of the networks ability to support information flow across time, referred to as the Start-End separation rank, which reflects the distance of the function realized by the recurrent network from modeling no dependency between the beginning and end of the input sequence. We prove that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts. Thus, we establish that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and provide an exemplar of quantifying this key attribute which may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks. We obtain our results by considering a class of recurrent networks referred to as Recurrent Arithmetic Circuits, which merge the hidden state with the input via the Multiplicative Integration operation, and empirically demonstrate the discussed phenomena on common RNNs. Finally, we employ the tool of quantum Tensor Networks to gain additional graphic insight regarding the complexity brought forth by depth in recurrent networks.

التعلم الآلي الحوسبة العصبية والتطورية

Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition

402 - Xiangang Li , Xihong Wu 2014

Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks. To achieve a further performance improvement, in this research, deep extensions on LSTM are investigated considering that deep hierarchical model has turned out to be more efficient than a shallow one. Motivated by previous research on constructing deep recurrent neural networks (RNNs), alternative deep LSTM architectures are proposed and empirically evaluated on a large vocabulary conversational telephone speech recognition task. Meanwhile, regarding to multi-GPU devices, the training process for LSTM networks is introduced and discussed. Experimental results demonstrate that the deep LSTM networks benefit from the depth and yield the state-of-the-art performance on this task.

الحساب واللغة الحوسبة العصبية والتطورية

Malicious Requests Detection with Improved Bidirectional Long Short-term Memory Neural Networks

76 - Wenhao Li , Bincheng Zhang , Jiajie Zhang 2020

Detecting and intercepting malicious requests are one of the most widely used ways against attacks in the network security. Most existing detecting approaches, including matching blacklist characters and machine learning algorithms have all shown to be vulnerable to sophisticated attacks. To address the above issues, a more general and rigorous detection method is required. In this paper, we formulate the problem of detecting malicious requests as a temporal sequence classification problem, and propose a novel deep learning model namely Convolutional Neural Network-Bidirectional Long Short-term Memory-Convolutional Neural Network (CNN-BiLSTM-CNN). By connecting the shadow and deep feature maps of the convolutional layers, the malicious feature extracting ability is improved on more detailed functionality. Experimental results on HTTP dataset CSIC 2010 have demonstrated the effectiveness of the proposed method when compared with the state-of-the-arts.

التعلم الآلي بنية الشبكات والإنترنت

VLSTM: Very Long Short-Term Memory Networks for High-Frequency Trading

128 - Prakhar Ganesh , Puneet Rakheja 2018

Financial trading is at the forefront of time-series analysis, and has grown hand-in-hand with it. The advent of electronic trading has allowed complex machine learning solutions to enter the field of financial trading. Financial markets have both lo ng term and short term signals and thus a good predictive model in financial trading should be able to incorporate them together. One of the most sought after forms of electronic trading is high-frequency trading (HFT), typically known for microsecond sensitive changes, which results in a tremendous amount of data. LSTMs are one of the most capable variants of the RNN family that can handle long-term dependencies, but even they are not equipped to handle such long sequences of the order of thousands of data points like in HFT. We propose very-long short term memory networks, or VLSTMs, to deal with such extreme length sequences. We explore the importance of VLSTMs in the context of HFT. We compare our model on publicly available dataset and got a 3.14% increase in F1-score over the existing state-of-the-art time-series forecasting models. We also show that our model has great parallelization potential, which is essential for practical purposes when trading on such markets.

التعلم الآلي الإحصاء والتجارة والسوق الصغير التعلم الالي