
Application of data compression techniques to time series forecasting

Added by Boris Ryabko
Publication date: 2019
Language: English





In this study we show that standard, well-known file compression programs (zlib, bzip2, etc.) are able to forecast real-world time series data well. The strength of our approach is its ability to use a set of data compression algorithms and to choose the best one of them automatically during the forecasting process. Moreover, modern data compressors are able to find many kinds of latent regularities using methods of artificial intelligence (for example, some data compressors are based on finding the smallest formal grammar that describes the time series). Thus, our approach makes it possible to apply particular methods of artificial intelligence to time-series forecasting. As examples of the application of the proposed method, we made forecasts for the monthly T-index and the Kp-index time series using standard compressors. In both cases, we used the Mean Absolute Error (MAE) as the accuracy measure. For the monthly T-index time series, we made 18 forecasts beyond the available data for each month from January 2011 to July 2017. We show that, in comparison with the forecasts made by the Australian Bureau of Meteorology, our method predicts one value ahead more accurately. The Kp-index time series consists of 3-hour values ranging from 0 to 9. For each day from February 4, 2018 to March 28, 2018, we made forecasts for 24 values ahead. We compared our forecasts with those made by the Space Weather Prediction Center (SWPC). The results showed that the accuracy of our method is similar to that of the SWPC's method. As in the previous case, our one-step forecasts were again more accurate.
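To make the mechanics concrete, below is a minimal sketch of compression-based next-value prediction, assuming a series quantized to small integers (as the Kp-index is). The names COMPRESSORS, best_compressor and forecast_next are illustrative, and this is only a sketch of the general idea, not the authors' exact procedure: the compressor that codes the observed history most compactly is selected automatically, and the continuation that compresses best is taken as the forecast.

```python
# Sketch: pick the next value as the continuation that the best compressor
# codes most compactly. Assumes the series is quantized to integers 0..9.
import bz2
import zlib

COMPRESSORS = {
    "zlib": lambda data: len(zlib.compress(data, 9)),
    "bzip2": lambda data: len(bz2.compress(data, 9)),
}

def best_compressor(history):
    """Choose the compressor giving the shortest code for the observed history."""
    data = bytes(history)
    return min(COMPRESSORS, key=lambda name: COMPRESSORS[name](data))

def forecast_next(history, alphabet=range(10)):
    """Predict the next value as the candidate continuation that compresses best."""
    codelen = COMPRESSORS[best_compressor(history)]
    return min(alphabet, key=lambda v: codelen(bytes(list(history) + [v])))

series = [3, 4, 4, 5, 3, 2, 2, 3, 4, 5, 5, 4] * 10   # toy quantized series
print(forecast_next(series))
```

Multi-step forecasts can be produced by appending each predicted value to the history and repeating the procedure.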



Related research

Many applications require the ability to judge uncertainty of time-series forecasts. Uncertainty is often specified as point-wise error bars around a mean or median forecast. Due to temporal dependencies, such a method obscures some information. We would ideally have a way to query the posterior probability of the entire time-series given the predictive variables, or at a minimum, be able to draw samples from this distribution. We use a Bayesian dictionary learning algorithm to statistically generate an ensemble of forecasts. We show that the algorithm performs as well as a physics-based ensemble method for temperature forecasts for Houston. We conclude that the method shows promise for scenario forecasting where physics-based methods are absent.
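As a toy illustration of ensemble generation by posterior sampling (not the paper's Bayesian dictionary learning algorithm), the sketch below assumes a fixed Fourier dictionary and a conjugate linear-Gaussian model, then draws whole forecast trajectories from the coefficient posterior; all parameters and names are chosen for illustration only.

```python
# Toy ensemble forecast: Bayesian linear model over a fixed Fourier dictionary.
import numpy as np

rng = np.random.default_rng(0)
t_obs = np.arange(200)
y = 10 + 5 * np.sin(2 * np.pi * t_obs / 24) + rng.normal(0, 1, t_obs.size)

def design(t, n_harmonics=3, period=24):
    cols = [np.ones_like(t, dtype=float)]
    for k in range(1, n_harmonics + 1):
        cols += [np.sin(2 * np.pi * k * t / period), np.cos(2 * np.pi * k * t / period)]
    return np.column_stack(cols)

X = design(t_obs)
noise_var, prior_var = 1.0, 100.0
# Gaussian posterior over dictionary coefficients (conjugate linear-Gaussian model).
post_cov = np.linalg.inv(X.T @ X / noise_var + np.eye(X.shape[1]) / prior_var)
post_mean = post_cov @ X.T @ y / noise_var

t_new = np.arange(200, 224)                      # 24 steps ahead
X_new = design(t_new)
coef_samples = rng.multivariate_normal(post_mean, post_cov, size=100)
ensemble = coef_samples @ X_new.T                # 100 sampled forecast trajectories
print(ensemble.mean(axis=0)[:5], ensemble.std(axis=0)[:5])
```

Each row of `ensemble` is a full trajectory, so joint (temporal) uncertainty can be read off directly rather than only point-wise error bars.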
Boris Ryabko (2018)
Suppose there is a large file which should be transmitted (or stored) and there are several (say, m) admissible data compressors. It seems natural to try all the compressors and then choose the best one, i.e. the one that gives the shortest compressed file, and then transfer (or store) the index number of the best compressor (which requires log m bits) together with the compressed file. The only problem is the time, which increases substantially due to the need to compress the file m times (in order to find the best compressor). We propose a method that encodes the file with the optimal compressor but uses relatively little additional time: the ratio of this extra time to the total computation time can be bounded by an arbitrarily small positive constant. Generally speaking, in many situations it may be necessary to find the best data compressor out of a given set, which is often done by comparing them empirically. One of the goals of this work is to turn such a selection process into a part of the data compression method, automating and optimizing it.
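For illustration, here is a small sketch of the naive scheme this abstract starts from: compress with each of the m candidate compressors, keep the shortest output, and prepend the index of the winner (about log m bits, one byte in this toy code). The time-efficient method proposed in the paper itself is not reproduced here.

```python
# Naive "try every compressor, keep the best" encoding with a 1-byte index.
import bz2
import lzma
import zlib

COMPRESSORS = [
    (zlib.compress, zlib.decompress),
    (bz2.compress, bz2.decompress),
    (lzma.compress, lzma.decompress),
]

def encode(data: bytes) -> bytes:
    candidates = [comp(data) for comp, _ in COMPRESSORS]
    best = min(range(len(candidates)), key=lambda i: len(candidates[i]))
    return bytes([best]) + candidates[best]      # winner index + its compressed output

def decode(blob: bytes) -> bytes:
    _, decomp = COMPRESSORS[blob[0]]
    return decomp(blob[1:])

payload = b"abracadabra " * 1000
assert decode(encode(payload)) == payload
```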
According to Kolmogorov complexity, every finite binary string is compressible to a shortest code -- its information content -- from which it is effectively recoverable. We investigate the extent to which this holds for infinite binary sequences (streams). We devise a new coding method which uniformly codes every stream $X$ into an algorithmically random stream $Y$, in such a way that the first $n$ bits of $X$ are recoverable from the first $I(X\upharpoonright_n)$ bits of $Y$, where $I$ is any partial computable information content measure which is defined on all prefixes of $X$, and where $X\upharpoonright_n$ is the initial segment of $X$ of length $n$. As a consequence, if $g$ is any computable upper bound on the initial segment prefix-free complexity of $X$, then $X$ is computable from an algorithmically random $Y$ with oracle-use at most $g$. Alternatively (making no use of such a computable bound $g$), one can achieve an oracle-use bounded above by $K(X\upharpoonright_n)+\log n$. This provides a strong analogue of Shannon's source coding theorem for algorithmic information theory.
Seasonal time series forecasting remains a challenging problem due to the long-term dependency introduced by seasonality. In this paper, we propose a two-stage framework to forecast univariate seasonal time series. The first stage explicitly learns the long-range time series structure in a time window beyond the forecast horizon. By incorporating the learned long-range structure, the second stage can enhance the prediction accuracy within the forecast horizon. In both stages, we integrate an auto-regressive model with neural networks to capture both linear and non-linear characteristics of the time series. Our framework achieves state-of-the-art performance on the M4 Competition Hourly dataset. In particular, we show that incorporating the intermediate results generated in the first stage into existing forecast models can effectively enhance their prediction performance.
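A rough, purely illustrative sketch of a generic two-stage pipeline of this kind is given below: the first stage estimates a long-range seasonal profile from the full history, and the second stage fits a simple least-squares autoregression on the residuals. The neural-network component described in the paper is omitted and all names are hypothetical.

```python
# Generic two-stage forecast: seasonal profile (stage 1) + AR on residuals (stage 2).
import numpy as np

def two_stage_forecast(y, period=24, ar_order=3, horizon=24):
    y = np.asarray(y, dtype=float)
    # Stage 1: long-range structure as a per-phase seasonal profile.
    profile = np.array([y[p::period].mean() for p in range(period)])
    resid = y - profile[np.arange(y.size) % period]
    # Stage 2: least-squares AR(p) fitted on the deseasonalized residuals.
    rows = [resid[i:i + ar_order] for i in range(resid.size - ar_order)]
    A, b = np.array(rows), resid[ar_order:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Roll the AR model forward and add the seasonal profile back in.
    hist, out = list(resid[-ar_order:]), []
    for h in range(horizon):
        nxt = float(np.dot(coef, hist[-ar_order:]))
        hist.append(nxt)
        out.append(nxt + profile[(y.size + h) % period])
    return np.array(out)

t = np.arange(24 * 30)
series = 10 + 5 * np.sin(2 * np.pi * t / 24) + np.random.default_rng(1).normal(0, 0.5, t.size)
print(two_stage_forecast(series)[:6])
```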
This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precise, quantitative bounds are derived, connecting the distribution of the optimal codelengths to the source information spectrum, and an exact analysis of the best achievable rate for arbitrary sources is given. Fine asymptotic results are proved for arbitrary (not necessarily prefix) compressors on general mixing sources. Non-asymptotic, explicit Gaussian approximation bounds are established for the best achievable rate on Markov sources. The source dispersion and the source varentropy rate are defined and characterized. Together with the entropy rate, the varentropy rate serves to tightly approximate the fundamental non-asymptotic limits of fixed-to-variable compression for all but very small blocklengths.
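To indicate how the varentropy rate enters such results, the Gaussian approximations discussed above have, to first order, roughly the following form (the exact higher-order terms and the prefix/non-prefix distinctions are treated in the paper):

$$ R^*(n,\epsilon) \;\approx\; H \;+\; \sigma\,\frac{Q^{-1}(\epsilon)}{\sqrt{n}}, $$

where $R^*(n,\epsilon)$ is the best achievable rate at blocklength $n$ and excess-rate probability $\epsilon$, $H$ is the entropy rate, $\sigma^2$ is the varentropy rate, and $Q^{-1}$ is the inverse complementary Gaussian distribution function.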