To obtain a model that processes sequential data, such as the data arising in machine translation and speech recognition, faster and more accurately, we propose adopting the Chrono Initializer as the initialization method for the Minimal Gated Unit. We evaluated the method on two tasks: the adding task and the copy task. The experiments confirmed the effectiveness of the proposed method.
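As a rough illustration of chrono-style initialization for a single-gate unit, the sketch below draws gate biases as -log(U(1, T_max - 1)). It assumes the gate f_t enters the update as h_t = (1 - f_t) * h_{t-1} + f_t * h~_t, so a small gate value means long memory. The function name `chrono_gate_bias`, the parameter `t_max`, and this sign convention are assumptions made for illustration; the abstract does not state how the authors apply the initializer to the Minimal Gated Unit.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chrono_gate_bias(hidden_size, t_max, seed=0):
    """Draw one gate bias per hidden unit as -log(u), with u ~ U(1, t_max - 1).

    Assumption: the gate multiplies the candidate state, so sigmoid(bias)
    near 1/t_max corresponds to a memory time scale of roughly t_max steps.
    """
    if t_max <= 2:
        raise ValueError("t_max must be greater than 2")
    rng = random.Random(seed)
    return [-math.log(rng.uniform(1.0, t_max - 1.0)) for _ in range(hidden_size)]


# Hypothetical sizes, chosen only for the example.
biases = chrono_gate_bias(hidden_size=8, t_max=100)
gate_values = [sigmoid(b) for b in biases]
```

Since sigmoid(-log(u)) = 1 / (1 + u), the initial gate values fall between 1/t_max and 1/2, which spreads the units' initial time scales between about 2 and t_max steps.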
An accurate road surface friction prediction algorithm can enable intelligent transportation systems to share timely road surface conditions with the public and thereby increase the safety of road users. Previously, scholars developed multiple prediction models for forecasting road surface conditions using historical data. However, road surface condition data cannot be collected perfectly at every timestamp; e.g., data collected by on-vehicle sensors may be interrupted when vehicles cannot travel because of cost or weather constraints. The resulting missing values in the collected data can damage the effectiveness and accuracy of existing prediction methods, since these methods assume input data with a fixed temporal resolution. This study proposes a road surface friction prediction model that employs a Gated Recurrent Unit network with a decay mechanism (GRU-D) to handle the missing values. The evaluation results show that the proposed GRU-D networks outperform all baseline models. The impact of the missing rate on predictive accuracy, learning efficiency, and the learned decay rate is analyzed as well. The findings can help improve the accuracy and efficiency of forecasting road surface friction from historical data sets with missing values, thereby mitigating the impact of wet or icy road conditions on traffic safety.
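The decay mechanism in GRU-D can be sketched for a single input feature as follows. A missing value is replaced by a blend of the last observation and the feature mean, with a weight gamma = exp(-max(0, w * delta + b)) that shrinks as the time delta since the last observation grows. This is a minimal sketch, not the study's implementation: in GRU-D the decay parameters `w` and `b` are learned, whereas here they are fixed hypothetical constants, and the friction values and timestamps are invented for the example.

```python
import math


def decay_rate(delta, w, b):
    """Decay weight in (0, 1]; equals 1 when w * delta + b <= 0."""
    return math.exp(-max(0.0, w * delta + b))


def impute_with_decay(values, mask, times, mean, w=0.5, b=0.0):
    """Fill missing entries by decaying the last observation toward the mean.

    values: observations (entries where mask is 0 are ignored)
    mask:   1 if observed, 0 if missing
    times:  timestamps of each step
    mean:   empirical mean of the feature
    """
    last_value = mean   # before any observation, fall back to the mean
    last_time = None
    imputed = []
    for x, m, t in zip(values, mask, times):
        if m:
            last_value, last_time = x, t
            imputed.append(x)
        else:
            delta = 0.0 if last_time is None else t - last_time
            gamma = decay_rate(delta, w, b)
            imputed.append(gamma * last_value + (1.0 - gamma) * mean)
    return imputed


# Hypothetical friction readings with two gaps at irregular times.
filled = impute_with_decay(
    values=[0.8, None, None, 0.4],
    mask=[1, 0, 0, 1],
    times=[0.0, 1.0, 5.0, 6.0],
    mean=0.5,
)
```

In this example the two imputed entries lie between the mean (0.5) and the last observation (0.8), and the entry with the longer gap sits closer to the mean.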
Electronic health records (EHR) are characterized as non-stationary, heterogeneous, noisy, and sparse data; therefore, it is challenging to learn the regularities or patterns inherent within them. In particular, sparseness, caused mostly by the many missing values, has attracted the attention of researchers, who have attempted to make better use of all available samples in solving a primary target task by defining a secondary imputation problem. Methodologically, existing methods, either deterministic or stochastic, have applied different assumptions to impute missing values. However, once the missing values are imputed, most existing methods do not consider the fidelity or confidence of the imputed values when modeling downstream tasks. An erroneous or improper imputation of missing variables can cause difficulties in modeling as well as degraded performance. In this study, we present a novel variational recurrent network that (i) estimates the distribution of missing variables, allowing uncertainty in the imputed values to be represented, (ii) updates hidden states by explicitly applying fidelity based on the variance of the imputed values during recurrence (i.e., uncertainty propagation over time), and (iii) predicts the likelihood of in-hospital mortality. Notably, our model can conduct these procedures in a single stream and learn all network parameters jointly in an end-to-end manner. We validated the effectiveness of our method on the public MIMIC-III and PhysioNet Challenge 2012 datasets, where it outperformed the other state-of-the-art methods for mortality prediction considered in our experiments. In addition, we found that the model represented the uncertainties of the imputed estimates well, as indicated by a high correlation between the calculated MAE and the uncertainty.
Diversity-based approaches have recently gained popularity as an alternative paradigm to performance-based policy search. A popular approach from this family, Quality-Diversity (QD), maintains a collection of high-performing policies separated in the diversity-metric space, which is defined based on the policies' rollout behaviours. When policies are parameterised as neural networks, i.e. Neuroevolution, QD tends not to scale well with the dimensionality of the parameter space. Our hypothesis is that there exists a low-dimensional manifold embedded in the policy parameter space that contains a high density of diverse and feasible policies. We propose a novel approach to diversity-based policy search via Neuroevolution that leverages learned latent representations of the policy parameters which capture the local structure of the data. Our approach iteratively collects policies according to the QD framework in order to (i) build a collection of diverse policies, (ii) use it to learn a latent representation of the policy parameters, and (iii) perform policy search in the learned latent space. We use the Jacobian of the inverse transformation (i.e., the reconstruction function) to guide the search in the latent space. This ensures that the generated samples remain in the high-density regions of the original space after reconstruction. We evaluate our contributions on three continuous control tasks in simulated environments and compare them to diversity-based baselines. The findings suggest that our approach yields a more efficient and robust policy search process.
We propose an approach for improving sequence modeling based on autoregressive normalizing flows. Each autoregressive transform, acting across time, serves as a moving frame of reference, removing temporal correlations, and simplifying the modeling of higher-level dynamics. This technique provides a simple, general-purpose method for improving sequence modeling, with connections to existing and classical techniques. We demonstrate the proposed approach both with standalone flow-based models and as a component within sequential latent variable models. Results are presented on three benchmark video datasets, where autoregressive flow-based dynamics improve log-likelihood performance over baseline models. Finally, we illustrate the decorrelation and improved generalization properties of using flow-based dynamics.
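The idea of an autoregressive transform acting across time can be illustrated with a scalar affine flow, y_t = (x_t - mu_t) / sigma_t, where mu_t and sigma_t depend only on earlier inputs. The sketch below is an assumption-laden toy rather than the proposed model: the conditioner `previous_frame_conditioner` is hypothetical and simply uses the previous value as the shift with unit scale, whereas a learned network would normally produce mu_t and sigma_t.

```python
import math


def previous_frame_conditioner(prev):
    """Hypothetical conditioner: shift by the previous value, unit scale."""
    return prev, 1.0


def flow_forward(xs, conditioner=previous_frame_conditioner, x_init=0.0):
    """Map x_{1:T} to y_{1:T}; also return log|det dy/dx| = -sum(log sigma_t)."""
    ys, log_det, prev = [], 0.0, x_init
    for x in xs:
        mu, sigma = conditioner(prev)
        ys.append((x - mu) / sigma)
        log_det -= math.log(sigma)
        prev = x
    return ys, log_det


def flow_inverse(ys, conditioner=previous_frame_conditioner, x_init=0.0):
    """Recover x_{1:T} step by step from y_{1:T}."""
    xs, prev = [], x_init
    for y in ys:
        mu, sigma = conditioner(prev)
        x = mu + sigma * y
        xs.append(x)
        prev = x
    return xs


xs = [1.0, 2.0, 3.0, 4.0]          # a linear ramp, strongly correlated in time
ys, log_det = flow_forward(xs)      # with this conditioner: temporal differences
xs_reconstructed = flow_inverse(ys)
```

With this particular conditioner the transform reduces to temporal differencing, so the ramp maps to a constant sequence; the remaining structure in y is what a higher-level dynamics model would then have to capture.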
Many popular variants of graph neural networks (GNNs) that are capable of handling multi-relational graphs may suffer from vanishing gradients. In this work, we propose a novel GNN architecture based on the Gated Graph Neural Network with an improved ability to handle long-range dependencies in multi-relational graphs. An experimental analysis on different synthetic tasks demonstrates that the proposed architecture outperforms several popular GNN models.