Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models

93 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Vincent Le Guen

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Vincent Le Guen

التعلم الالي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper addresses the problem of time series forecasting for non-stationary signals and multiple future steps prediction. To handle this challenging task, we introduce DILATE (DIstortion Loss including shApe and TimE), a new objective function for training deep neural networks. DILATE aims at accurately predicting sudden changes, and explicitly incorporates two terms supporting precise shape and temporal change detection. We introduce a differentiable loss function suitable for training deep neural nets, and provide a custom back-prop implementation for speeding up optimization. We also introduce a variant of DILATE, which provides a smooth generalization of temporally-constrained Dynamic Time Warping (DTW). Experiments carried out on various non-stationary datasets reveal the very good behaviour of DILATE compared to models trained with the standard Mean Squared Error (MSE) loss function, and also to DTW and variants. DILATE is also agnostic to the choice of the model, and we highlight its benefit for training fully connected networks as well as specialized recurrent architectures, showing its capacity to improve over state-of-the-art trajectory forecasting approaches.

قيم البحث

اقرأ أيضاً

Time-series Scenario Forecasting

223 - Sriharsha Veeramachaneni 2012

Many applications require the ability to judge uncertainty of time-series forecasts. Uncertainty is often specified as point-wise error bars around a mean or median forecast. Due to temporal dependencies, such a method obscures some information. We w ould ideally have a way to query the posterior probability of the entire time-series given the predictive variables, or at a minimum, be able to draw samples from this distribution. We use a Bayesian dictionary learning algorithm to statistically generate an ensemble of forecasts. We show that the algorithm performs as well as a physics-based ensemble method for temperature forecasts for Houston. We conclude that the method shows promise for scenario forecasting where physics-based methods are absent.

التعلم الالي التعلم الآلي تطبيقات الإحصاء

Deep Video Prediction for Time Series Forecasting

90 - Zhen Zeng , Tucker Balch , Manuela Veloso 2021

Time series forecasting is essential for decision making in many domains. In this work, we address the challenge of predicting prices evolution among multiple potentially interacting financial assets. A solution to this problem has obvious importance for governments, banks, and investors. Statistical methods such as Auto Regressive Integrated Moving Average (ARIMA) are widely applied to these problems. In this paper, we propose to approach economic time series forecasting of multiple financial assets in a novel way via video prediction. Given past prices of multiple potentially interacting financial assets, we aim to predict the prices evolution in the future. Instead of treating the snapshot of prices at each time point as a vector, we spatially layout these prices in 2D as an image, such that we can harness the power of CNNs in learning a latent representation for these financial assets. Thus, the history of these prices becomes a sequence of images, and our goal becomes predicting future images. We build on a state-of-the-art video prediction method for forecasting future images. Our experiments involve the prediction task of the price evolution of nine financial assets traded in U.S. stock markets. The proposed method outperforms baselines including ARIMA, Prophet, and variations of the proposed method, demonstrating the benefits of harnessing the power of CNNs in the problem of economic time series forecasting.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي الاقتصاد القياسي

Deep Adaptive Input Normalization for Time Series Forecasting

278 - Nikolaos Passalis , Anastasios Tefas , Juho Kanniainen 2019

Deep Learning (DL) models can be used to tackle time series analysis tasks with great success. However, the performance of DL models can degenerate rapidly if the data are not appropriately normalized. This issue is even more apparent when DL is used for financial time series forecasting tasks, where the non-stationary and multimodal nature of the data pose significant challenges and severely affect the performance of DL models. In this work, a simple, yet effective, neural layer, that is capable of adaptively normalizing the input time series, while taking into account the distribution of the data, is proposed. The proposed layer is trained in an end-to-end fashion using back-propagation and leads to significant performance improvements compared to other evaluated normalization schemes. The proposed method differs from traditional normalization methods since it learns how to perform normalization for a given task instead of using a fixed normalization scheme. At the same time, it can be directly applied to any new time series without requiring re-training. The effectiveness of the proposed method is demonstrated using a large-scale limit order book dataset, as well as a load forecasting dataset.

المالية الحاسوبية التعلم الآلي

Generalisation in fully-connected neural networks for time series forecasting

78 - Anastasia Borovykh , Cornelis W. Oosterlee , Sander M. Bohte 2019

In this paper we study the generalization capabilities of fully-connected neural networks trained in the context of time series forecasting. Time series do not satisfy the typical assumption in statistical learning theory of the data being i.i.d. sam ples from some data-generating distribution. We use the input and weight Hessians, that is the smoothness of the learned function with respect to the input and the width of the minimum in weight space, to quantify a networks ability to generalize to unseen data. While such generalization metrics have been studied extensively in the i.i.d. setting of for example image recognition, here we empirically validate their use in the task of time series forecasting. Furthermore we discuss how one can control the generalization capability of the network by means of the training process using the learning rate, batch size and the number of training iterations as controls. Using these hyperparameters one can efficiently control the complexity of the output function without imposing explicit constraints.

التعلم الالي التعلم الآلي

Think Globally, Act Locally: A Deep Neural Network Approach to High-Dimensional Time Series Forecasting

97 - Rajat Sen , Hsiang-Fu Yu , Inderjit Dhillon 2019

Forecasting high-dimensional time series plays a crucial role in many applications such as demand forecasting and financial predictions. Modern datasets can have millions of correlated time-series that evolve together, i.e they are extremely high dim ensional (one dimension for each individual time-series). There is a need for exploiting global patterns and coupling them with local calibration for better prediction. However, most recent deep learning approaches in the literature are one-dimensional, i.e, even though they are trained on the whole dataset, during prediction, the future forecast for a single dimension mainly depends on past values from the same dimension. In this paper, we seek to correct this deficiency and propose DeepGLO, a deep forecasting model which thinks globally and acts locally. In particular, DeepGLO is a hybrid model that combines a global matrix factorization model regularized by a temporal convolution network, along with another temporal network that can capture local properties of each time-series and associated covariates. Our model can be trained effectively on high-dimensional but diverse time series, where different time series can have vastly different scales, without a priori normalization or rescaling. Empirical results demonstrate that DeepGLO can outperform state-of-the-art approaches; for example, we see more than 25% improvement in WAPE over other methods on a public dataset that contains more than 100K-dimensional time series.

التعلم الالي التعلم الآلي