With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNN) are considered particularly well suited for modeling sensory and streaming data. When predicting future behavior, incorporating information from neighboring sensor stations is often beneficial. We propose a new RNN-based architecture for context-specific information fusion across multiple spatially distributed sensor stations. To this end, the latent representations of multiple local models, each modeling one sensor station, are joined and weighted according to their importance for the prediction. The importance of each station is assessed, depending on the current context, by a separate attention function. We demonstrate the effectiveness of our model on three different real-world sensor network datasets.
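As a sketch (not the authors' exact architecture), one possible reading of this design attaches one GRU per sensor station and lets an attention function, conditioned on the hidden state of a hypothetical target station, weight the neighboring latent representations before a joint prediction; all module names, dimensions, and the concatenation-based scoring below are assumptions.

```python
# Illustrative sketch, assuming per-station GRUs and additive-style attention
# scored on (context, neighbor) hidden-state pairs.
import torch
import torch.nn as nn

class AttentiveSensorFusion(nn.Module):
    def __init__(self, n_stations, in_dim, hid_dim):
        super().__init__()
        # one local recurrent model per sensor station
        self.local_rnns = nn.ModuleList(
            [nn.GRU(in_dim, hid_dim, batch_first=True) for _ in range(n_stations)]
        )
        self.attn = nn.Linear(2 * hid_dim, 1)   # scores a (context, neighbor) pair
        self.head = nn.Linear(hid_dim, 1)       # one-step-ahead prediction

    def forward(self, x, target_station):
        # x: (batch, n_stations, seq_len, in_dim)
        h = [rnn(x[:, i])[1][-1] for i, rnn in enumerate(self.local_rnns)]
        h = torch.stack(h, dim=1)                        # (batch, n_stations, hid_dim)
        ctx = h[:, target_station].unsqueeze(1).expand_as(h)
        scores = self.attn(torch.cat([ctx, h], dim=-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=1)           # context-dependent importance
        fused = (weights.unsqueeze(-1) * h).sum(dim=1)   # weighted joint representation
        return self.head(fused), weights

model = AttentiveSensorFusion(n_stations=4, in_dim=3, hid_dim=16)
y, w = model(torch.randn(2, 4, 20, 3), target_station=0)
```

Returning the attention weights alongside the prediction makes the learned station importances inspectable, which matches the interpretability motivation of such fusion schemes.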
Considering deep sequence learning for practical applications, two representative RNNs, LSTM and GRU, may come to mind first. Nevertheless, is there no room for other RNNs? Will a better RNN emerge in the future? In this work, we propose a novel, succinct and promising RNN: the Fusion Recurrent Neural Network (Fusion RNN). Fusion RNN is composed of a Fusion module and a Transport module at every time step. The Fusion module performs a multi-round fusion of the input and the hidden state vector. The Transport module, which essentially follows a simple recurrent network, computes the hidden state and passes it on to the next time step. Furthermore, in order to evaluate Fusion RNN's sequence feature extraction capability, we choose a representative data mining task for sequence data, estimated time of arrival (ETA), and present a novel model based on Fusion RNN. We compare our method with other RNN variants for ETA on massive vehicle travel data from DiDi Chuxing. The results demonstrate that, for ETA, Fusion RNN is comparable to the state-of-the-art LSTM and GRU, which are more complicated than Fusion RNN.
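Since the abstract does not spell out the update equations, the cell below is only a hypothetical reading of the two-module recurrence: a Fusion module that repeatedly mixes the input and hidden vectors, followed by a Transport module performing a simple-recurrent-network style update; the number of fusion rounds and the gating form are assumptions.

```python
# Hypothetical Fusion RNN cell sketch: multi-round input/hidden fusion,
# then a simple recurrent (Elman-style) transport update.
import torch
import torch.nn as nn

class FusionRNNCell(nn.Module):
    def __init__(self, in_dim, hid_dim, rounds=2):
        super().__init__()
        self.rounds = rounds
        self.fuse_x = nn.Linear(hid_dim, in_dim)   # hidden -> input-sized gate
        self.fuse_h = nn.Linear(in_dim, hid_dim)   # input -> hidden-sized gate
        self.w_x = nn.Linear(in_dim, hid_dim)
        self.w_h = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, h):
        # Fusion module: multi-round mutual refinement of x and h
        for _ in range(self.rounds):
            x = x * torch.sigmoid(self.fuse_x(h))
            h = h * torch.sigmoid(self.fuse_h(x))
        # Transport module: simple recurrent update passed to the next time step
        return torch.tanh(self.w_x(x) + self.w_h(h))

cell = FusionRNNCell(in_dim=8, hid_dim=32)
h = torch.zeros(4, 32)
for t in range(10):                       # unroll over a toy sequence
    h = cell(torch.randn(4, 8), h)
```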
In this paper, we consider high-dimensional stationary processes where a new observation is generated from a compressed version of past observations. The specific evolution is modeled by an encoder-decoder structure. We estimate the evolution with an encoder-decoder neural network and give upper bounds for the expected forecast error under specific structural and sparsity assumptions. The results are shown separately under conditions either on the absolutely regular mixing coefficients or on the functional dependence measure of the observed process. In a quantitative simulation, we discuss the behavior of the network estimator under different model assumptions. We corroborate our theory with a real-data example in which we forecast temperature data.
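For concreteness, a hedged sketch of such an encoder-decoder forecaster is given below: a feedforward encoder compresses a window of past observations into a low-dimensional code and a decoder maps the code to the next observation. The window length, layer widths, and bottleneck size are placeholder choices, and the sparsity structure assumed in the theory is not enforced here.

```python
# Minimal encoder-decoder one-step-ahead forecaster (illustrative sizes only).
import torch
import torch.nn as nn

d, lag, code = 20, 5, 4          # observation dimension, lookback window, bottleneck

class EncDecForecaster(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d * lag, 64), nn.ReLU(),
                                     nn.Linear(64, code))
        self.decoder = nn.Sequential(nn.Linear(code, 64), nn.ReLU(),
                                     nn.Linear(64, d))

    def forward(self, past):                 # past: (batch, lag, d)
        z = self.encoder(past.flatten(1))    # compressed representation of the past
        return self.decoder(z)               # forecast of the next observation

model = EncDecForecaster()
x_hat = model(torch.randn(16, lag, d))       # (16, d)
```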
A simple yet effective architectural design makes radial basis function neural networks (RBFNN) amongst the most popular conventional neural networks. The current generation of radial basis function neural networks is equipped with multiple kernels, which provide significant performance benefits compared to the previous generation using only a single kernel. In existing multi-kernel RBF algorithms, the multi-kernel is formed by a convex combination of the base/primary kernels. In this paper, we propose a novel multi-kernel RBFNN in which every base kernel has its own (local) weight. This added flexibility in the network provides performance benefits such as a faster convergence rate, better local minima and resilience against getting stuck in poor local minima. These performance gains are achieved at a computational complexity competitive with contemporary multi-kernel RBF algorithms. The proposed algorithm is thoroughly analysed for performance gains using mathematical and graphical illustrations and is evaluated on three different types of problems, namely: (i) pattern classification, (ii) system identification and (iii) function approximation. Empirical results clearly show the superiority of the proposed algorithm over existing state-of-the-art multi-kernel approaches.
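The NumPy sketch below contrasts the conventional convex combination of base kernels with a locally weighted variant in the spirit of the proposal; the specific base kernels, centres, and weight values are illustrative assumptions rather than the paper's exact formulation or training rule.

```python
# Forward pass of a multi-kernel RBF layer: global convex mixing vs.
# per-(kernel, neuron) local weights (both with random, untrained parameters).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                        # input samples
C = rng.normal(size=(10, 3))                         # RBF centres
D = np.linalg.norm(X[:, None] - C[None], axis=-1)    # distances: (samples, neurons)

K = np.stack([np.exp(-D**2),                         # Gaussian base kernel
              1.0 / np.sqrt(1.0 + D**2)])            # inverse multiquadric base kernel

# conventional multi-kernel: one global convex weight per base kernel
alpha = np.array([0.7, 0.3])
phi_global = np.tensordot(alpha, K, axes=1)          # (samples, neurons)

# locally weighted variant: a separate weight for every (kernel, neuron) pair
A = rng.uniform(size=(2, 10))
phi_local = (A[:, None, :] * K).sum(axis=0)          # (samples, neurons)

w = rng.normal(size=(10,))                           # output-layer weights
y_hat = phi_local @ w
```

The only structural change is replacing the shared mixing vector `alpha` with the matrix `A`, which is where the extra flexibility of local kernel weights comes from.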
Process mining consists of techniques that transform logs created by operative systems into process models. In process mining tools it is often desirable to classify ongoing process instances, e.g., to predict how long the process will still require to complete, or to assign process instances to different classes based only on the activities that have occurred in the process instance thus far. Recurrent neural networks and their subclasses, such as the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), have been demonstrated to learn temporal features that are relevant for subsequent classification tasks. In this paper we apply recurrent neural networks to classifying process instances. The proposed model is trained in a supervised fashion using labeled process instances extracted from event log traces. To the best of our knowledge, this is the first time GRU has been used to classify business process instances. Our main experimental result shows that GRU trains remarkably faster than LSTM while giving almost identical accuracy. An additional contribution of our paper is improving the training time of the classification model by filtering out infrequent activities, a technique commonly used, e.g., in Natural Language Processing (NLP).
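A minimal sketch of such a GRU-based prefix classifier is shown below; the vocabulary size, embedding and hidden dimensions, and the number of classes are placeholders, and the infrequent-activity filtering step is omitted.

```python
# GRU classifier for partial process traces: embed activity IDs, encode with a
# GRU, and map the final hidden state to class logits.
import torch
import torch.nn as nn

class TraceClassifier(nn.Module):
    def __init__(self, n_activities=50, emb_dim=16, hid_dim=32, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(n_activities, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, n_classes)

    def forward(self, traces):           # traces: (batch, prefix_len) activity IDs
        _, h_n = self.gru(self.emb(traces))
        return self.out(h_n[-1])         # class logits for each ongoing instance

model = TraceClassifier()
logits = model(torch.randint(1, 50, (8, 12)))   # 8 trace prefixes of length 12
```

Swapping `nn.GRU` for `nn.LSTM` (and taking the hidden state from the returned state tuple) gives the LSTM baseline the comparison refers to.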
Origin-destination (OD) matrices are often used in urban planning, where a city is partitioned into regions and an element (i, j) in an OD matrix records the cost (e.g., travel time, fuel consumption, or travel speed) from region i to region j. In this paper, we partition a day into multiple intervals, e.g., 96 15-minute intervals, and associate each interval with an OD matrix that represents the costs in that interval. We consider sparse and stochastic OD matrices, where the elements represent stochastic rather than deterministic costs and some elements are missing due to a lack of data between two regions. We solve the sparse, stochastic OD matrix forecasting problem: given a sequence of historical OD matrices that are sparse, we aim at predicting future OD matrices with no empty elements. We propose a generic learning framework that handles sparse matrices via matrix factorization and two graph convolutional neural networks, and captures temporal dynamics via a recurrent neural network. Empirical studies using two taxi datasets from different countries verify the effectiveness of the proposed framework.
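A rough, hypothetical sketch of such a framework is given below: low-rank origin/destination factors stand in for the matrix factorization, a linear layer applied after multiplication with a normalized adjacency stands in for the two graph convolutions, and a GRU captures temporal dynamics before the factors are multiplied back into a dense forecast matrix. The rank, the adjacency, and the handling of missing entries in training are all assumptions, not the paper's exact design.

```python
# Sketch: factorized OD matrices -> graph propagation -> GRU -> dense forecast.
import torch
import torch.nn as nn

n_regions, rank, T = 30, 8, 12
A_hat = torch.eye(n_regions)                      # placeholder normalized region adjacency

class ODForecaster(nn.Module):
    def __init__(self):
        super().__init__()
        self.gcn_o = nn.Linear(rank, rank)        # graph conv weights for origin factors
        self.gcn_d = nn.Linear(rank, rank)        # graph conv weights for destination factors
        self.rnn = nn.GRU(2 * n_regions * rank, 128, batch_first=True)
        self.out = nn.Linear(128, 2 * n_regions * rank)

    def forward(self, U, V):                      # factor sequences: (T, n_regions, rank)
        U = torch.relu(self.gcn_o(A_hat @ U))     # propagate origin factors over the graph
        V = torch.relu(self.gcn_d(A_hat @ V))     # propagate destination factors
        seq = torch.cat([U, V], dim=-1).reshape(1, T, -1)
        _, h = self.rnn(seq)                      # temporal dynamics over the intervals
        f = self.out(h[-1]).reshape(2, n_regions, rank)
        return f[0] @ f[1].T                      # dense forecast OD matrix, no empty cells

model = ODForecaster()
od_next = model(torch.randn(T, n_regions, rank), torch.randn(T, n_regions, rank))
```

In a full system the input factors would come from factorizing the observed sparse matrices, and the loss would be computed only on observed entries; both steps are omitted here for brevity.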