Models based on neural networks and machine learning are becoming increasingly popular in space physics. In particular, forecasting geomagnetic indices with neural network models is an active field of study. These models are typically evaluated with metrics such as the root-mean-square error (RMSE) and the Pearson correlation coefficient. However, these classical metrics sometimes fail to capture crucial behavior. To show where they fall short, we trained a long short-term memory (LSTM) neural network on OMNIWeb data to forecast the disturbance storm time (Dst) index from origin time $t$ with forecasting horizons of 1 to 6 hours. Evaluating the model's results with the correlation coefficient and RMSE indicated a performance comparable to the latest publications. However, visual inspection showed that the predictions made by the neural network behaved similarly to the persistence model. In this work, a new method is proposed to measure whether two time series are shifted in time with respect to each other, such as the persistence model output versus the observation. The new measure, based on Dynamic Time Warping, is capable of identifying results made by the persistence model and shows promising results in confirming the visual observations of the neural network's output. Finally, different methodologies for training the neural network are explored in order to remove the persistence behavior from the results.
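The idea of detecting a time shift with Dynamic Time Warping can be illustrated with a minimal sketch. This is not the measure proposed in the abstract above, whose exact definition is not given here; it is a simplified stand-in that assumes an unconstrained DTW with an absolute-difference local cost and reports the most frequent index offset along the warping path. The helper names `dtw_path`, `persistence_forecast`, and `dominant_shift`, and the toy `dst` values, are hypothetical.

```python
from collections import Counter


def dtw_path(a, b):
    """Return (total cost, warping path) for two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end to recover which samples were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def persistence_forecast(obs, horizon):
    """Persistence model: the forecast for time t is the value at t - horizon.

    The first `horizon` samples are padded with obs[0] (an assumption made
    only so that both series have the same length).
    """
    return [obs[0]] * horizon + list(obs[: len(obs) - horizon])


def dominant_shift(obs, forecast):
    """Most frequent offset (forecast index minus observation index) on the path."""
    _, path = dtw_path(obs, forecast)
    offsets = Counter(j - i for i, j in path)
    return offsets.most_common(1)[0][0]


# Toy storm-like profile (made-up values, hourly samples).
dst = [0, 0, 0, -10, -40, -80, -60, -30, -10, 0, 0, 0, 0]
shift = dominant_shift(dst, persistence_forecast(dst, horizon=2))
```

In this toy case the dominant offset equals the forecasting horizon, which is the signature of a forecast that merely repeats past observations. A forecast with good RMSE and correlation can still show this signature, which is the failure mode the abstract describes.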
Dynamic Time Warping (DTW) is widely used for temporal data processing. However, existing methods can neither learn discriminative prototypes of different classes nor exploit such prototypes for further analysis. We propose Discriminative Prototype DTW (DP-DTW), a novel method to learn class-specific discriminative prototypes for temporal recognition tasks. DP-DTW shows superior performance compared to conventional DTW methods on time series classification benchmarks. Combined with end-to-end deep learning, DP-DTW can handle challenging weakly supervised action segmentation problems and achieves state-of-the-art results on standard benchmarks. Moreover, the learned action prototypes enable detailed reasoning on the input video. Specifically, an action-based video summarization can be obtained by aligning the input sequence with action prototypes.
Machine learning models have had discernible achievements in a myriad of applications. However, most of these models are black boxes, and it is unclear how they make their decisions. This makes the models unreliable and untrustworthy. To provide insights into the decision-making processes of these models, a variety of traditional interpretable models have been proposed. Moreover, to generate more human-friendly explanations, recent work on interpretability tries to answer questions related to causality, such as "Why does this model make such decisions?" or "Was it a specific feature that caused the decision made by the model?". In this work, models that aim to answer causal questions are referred to as causal interpretable models. Existing surveys have covered the concepts and methodologies of traditional interpretability. In this work, we present a comprehensive survey on causal interpretable models from the aspects of problems and methods. In addition, this survey provides in-depth insights into the existing evaluation metrics for measuring interpretability, which can help practitioners understand the scenarios for which each evaluation metric is suitable.
Time series data analytics has been a problem of substantial interest for decades, and Dynamic Time Warping (DTW) has been the most widely adopted technique to measure dissimilarity between time series. A number of global-alignment kernels have since been proposed in the spirit of DTW to extend its use to kernel-based estimation methods such as support vector machines. However, those kernels suffer from diagonal dominance of the Gram matrix and a quadratic complexity w.r.t. the sample size. In this work, we study a family of alignment-aware positive definite (p.d.) kernels, whose feature embedding is given by a distribution of Random Warping Series (RWS). The proposed kernel does not suffer from diagonal dominance and naturally admits a Random Features (RF) approximation, which reduces the computational complexity of existing DTW-based techniques from quadratic to linear in terms of both the number and the length of time series. We also study the convergence of the RF approximation for the domain of time series of unbounded length. Our extensive experiments on 16 benchmark datasets demonstrate that RWS outperforms or matches state-of-the-art classification and clustering methods in both accuracy and computational time. Our code and data are available at https://github.com/IBM/RandomWarpingSeries.
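A rough sketch of how a random-features embedding built from random warping series might look is given below. The abstract does not specify the construction, so the following are assumptions: the random series are short Gaussian sequences with lengths drawn uniformly from 1 to `max_len`, the feature for each random series is the DTW alignment cost, and the features are scaled by the square root of the number of random series. All function names are hypothetical and do not correspond to the linked repository.

```python
import math
import random


def dtw_cost(a, b):
    """DTW alignment cost between two 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    return cost[n][m]


def random_warping_series(num_series, max_len, sigma, seed=0):
    """Draw short random series (assumed Gaussian, random length in 1..max_len)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, sigma) for _ in range(rng.randint(1, max_len))]
        for _ in range(num_series)
    ]


def rws_embed(series, warping_series):
    """Embed one time series as its scaled DTW costs to every random series."""
    scale = math.sqrt(len(warping_series))
    return [dtw_cost(series, w) / scale for w in warping_series]


def approx_kernel(x, y, warping_series):
    """Inner product of two embeddings, standing in for the kernel value."""
    ex, ey = rws_embed(x, warping_series), rws_embed(y, warping_series)
    return sum(u * v for u, v in zip(ex, ey))


ws = random_warping_series(num_series=8, max_len=5, sigma=1.0, seed=1)
embedding = rws_embed([0.0, 1.0, 2.0, 1.0], ws)
```

Under these assumptions, each series is compared only against a fixed set of short random series rather than against every other series, so the embedding cost grows linearly with the number of series and with their length. A linear model trained on the embeddings then replaces the full Gram matrix.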
Motivated by the phenomenon that companies introduce new products to keep abreast of customers' rapidly changing tastes, we consider a novel online learning setting where a profit-maximizing seller needs to learn customers' preferences by offering recommendations, which may contain existing products and new products that are launched in the middle of a selling period. We propose a sequential multinomial logit (SMNL) model to characterize customers' behavior when product recommendations are presented in tiers. For the offline version with known customers' preferences, we propose a polynomial-time algorithm and characterize the properties of the optimal tiered product recommendation. For the online problem, we propose a learning algorithm and quantify its regret bound. Moreover, we extend the setting to incorporate a constraint which ensures that every new product is learned to a given accuracy. Our results demonstrate that the tier structure can be used to mitigate the risks associated with learning new products.
Over the last decades there has been a continuing international endeavor to develop realistic space weather prediction tools that forecast the conditions on the Sun and in the interplanetary environment. These efforts have led to the need to develop appropriate metrics for assessing the performance of those tools. Metrics are necessary for validating models, comparing different models, and monitoring adjustments or improvements of a given model over time. In this work, we introduce Dynamic Time Warping (DTW) as an alternative way to validate models and, in particular, to quantify differences between observed and synthetic (modeled) time series for space weather purposes. We present the advantages and drawbacks of this method as well as applications to WIND observations and EUHFORIA modeled output at L1. We show that DTW is a useful tool that permits the evaluation of both the fast and slow solar wind. Its distinctive characteristic is that it warps sequences in time, aiming to align them with the minimum cost by using dynamic programming. It can be applied in two different ways for the evaluation of modeled solar wind time series. The first way calculates the so-called sequence similarity factor (SSF), a number that quantifies how good the forecast is compared to best-case and worst-case prediction scenarios. The second way quantifies the time and amplitude differences between the points that are best matched between the two sequences. As a result, it can serve as a hybrid metric between continuous measurements (such as the correlation coefficient) and point-by-point comparisons. We conclude that DTW is a promising technique for the assessment of solar wind profiles, offering functions that other metrics do not, so that it can give at once the most complete evaluation profile of a model.
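The two uses of DTW described above can be sketched as follows. The abstract does not give the formula for the SSF, so this sketch assumes it is the ratio of the DTW cost between observation and model to the DTW cost between observation and a worst-case reference, and it assumes that the reference is a flat series at the observed mean. Under that assumption a value of 0 corresponds to a perfect forecast and 1 to the reference. The function names and the sample wind speeds are hypothetical.

```python
def dtw(a, b):
    """Dynamic-programming DTW: return (minimal cost, list of matched index pairs)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack to recover the best-matched points.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def ssf(observed, modeled):
    """Assumed SSF: DTW cost of the model relative to a flat mean reference.

    Undefined (division by zero) if the observation itself is constant.
    """
    mean = sum(observed) / len(observed)
    reference = [mean] * len(observed)
    return dtw(observed, modeled)[0] / dtw(observed, reference)[0]


def matched_differences(observed, modeled, dt_hours=1.0):
    """Time and amplitude differences (model minus observation) per matched pair."""
    _, path = dtw(observed, modeled)
    return [((j - i) * dt_hours, modeled[j] - observed[i]) for i, j in path]


# Made-up solar wind speeds in km/s, one sample per hour.
observed = [400, 420, 600, 650, 500, 450, 400]
modeled = [400, 400, 430, 590, 640, 510, 440]
score = ssf(observed, modeled)
deltas = matched_differences(observed, modeled)
```

The list returned by `matched_differences` can be turned into histograms of time offsets and amplitude errors, which gives the point-by-point view, while `ssf` condenses the comparison into a single number.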