Suppose there is a large file that should be transmitted (or stored) and there are several (say, m) admissible data compressors. It seems natural to try all the compressors and then choose the best one, i.e., the one that gives the shortest compressed file, and then transfer (or store) the index of the best compressor (which requires log m bits) along with the compressed file. The only problem is the running time, which increases substantially because the file must be compressed m times in order to find the best compressor. We propose a method that encodes the file with the optimal compressor but uses relatively little additional time: the ratio of this extra time to the total computation time can be bounded by an arbitrary positive constant. More generally, in many situations it may be necessary to find the best data compressor out of a given set, which is often done by comparing them empirically. One of the goals of this work is to turn such a selection process into a part of the data-compression method itself, automating and optimizing it.
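As a minimal sketch of the naive baseline described above (not the proposed time-saving method), using Python's standard-library compressors for illustration:

```python
import bz2, lzma, zlib

# Naive baseline: run every admissible compressor, keep the shortest
# output, and prepend the winner's index (log m bits; here one byte
# for simplicity).
COMPRESSORS = [zlib.compress, bz2.compress, lzma.compress]
DECOMPRESSORS = [zlib.decompress, bz2.decompress, lzma.decompress]

def encode_with_best(data: bytes) -> bytes:
    outputs = [c(data) for c in COMPRESSORS]   # m compression passes
    best = min(range(len(outputs)), key=lambda i: len(outputs[i]))
    return bytes([best]) + outputs[best]       # index + compressed file

def decode(blob: bytes) -> bytes:
    return DECOMPRESSORS[blob[0]](blob[1:])
```

The m compression passes in `encode_with_best` are exactly the time cost that the proposed method avoids.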
In this study we show that standard, well-known file-compression programs (zlib, bzip2, etc.) can forecast real-world time-series data well. The strength of our approach is its ability to use a set of data-compression algorithms and automatically choose the best of them during the forecasting process. Moreover, modern data compressors can find many kinds of latent regularities using methods of artificial intelligence (for example, some compressors are based on finding the smallest formal grammar that describes the time series). Thus, our approach makes it possible to apply particular methods of artificial intelligence to time-series forecasting. As examples of the proposed method, we made forecasts for the monthly T-index and the Kp-index time series using standard compressors. In both cases, we used the Mean Absolute Error (MAE) as the accuracy measure. For the monthly T-index time series, we made 18 forecasts beyond the available data for each month from January 2011 to July 2017. We show that, compared with the forecasts made by the Australian Bureau of Meteorology, our method predicts one value ahead more accurately. The Kp-index time series consists of 3-hour values ranging from 0 to 9. For each day from February 4, 2018 to March 28, 2018, we made forecasts 24 values ahead. We compared our forecasts with those made by the Space Weather Prediction Center (SWPC). The results showed that the accuracy of our method is similar to that of the SWPC's method; as in the previous case, we also obtained more accurate one-step forecasts.
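A minimal sketch of one common compression-based forecasting heuristic, assuming an integer-valued series such as the Kp-index; the paper's actual estimator may differ:

```python
import bz2, zlib

# Compression-based one-step forecast: a candidate continuation that
# compresses well together with the history is treated as more
# probable. This only illustrates the idea.
def forecast_next(history, alphabet=range(10),
                  compressors=(zlib.compress, bz2.compress)):
    hist = bytes(history)
    best_value, best_len = None, float("inf")
    for v in alphabet:                      # e.g. Kp-index values 0..9
        candidate = hist + bytes([v])
        # automatically use whichever compressor does best here
        clen = min(len(c(candidate)) for c in compressors)
        if clen < best_len:
            best_value, best_len = v, clen
    return best_value
```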
This paper provides an extensive study of the behavior of the best achievable rate (and other related fundamental limits) in variable-length lossless compression. In the non-asymptotic regime, the fundamental limits of fixed-to-variable lossless compression with and without prefix constraints are shown to be tightly coupled. Several precise, quantitative bounds are derived, connecting the distribution of the optimal codelengths to the source information spectrum, and an exact analysis of the best achievable rate for arbitrary sources is given. Fine asymptotic results are proved for arbitrary (not necessarily prefix) compressors on general mixing sources. Non-asymptotic, explicit Gaussian approximation bounds are established for the best achievable rate on Markov sources. The source dispersion and the source varentropy rate are defined and characterized. Together with the entropy rate, the varentropy rate serves to tightly approximate the fundamental non-asymptotic limits of fixed-to-variable compression for all but very small blocklengths.
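For concreteness, and in standard information-spectrum notation (an assumption on our part, since the abstract does not fix notation), the varentropy rate is
\[
  \sigma^2 \;=\; \lim_{n\to\infty} \frac{1}{n}\,\operatorname{Var}\!\Bigl(\log_2 \frac{1}{P_{X^n}(X^n)}\Bigr),
\]
and the Gaussian approximation for the best achievable rate at blocklength $n$ and excess-probability $\epsilon$ takes the familiar form
\[
  R^*(n,\epsilon) \;\approx\; H + \sigma\,\frac{Q^{-1}(\epsilon)}{\sqrt{n}},
\]
where $H$ is the entropy rate and $Q^{-1}$ is the inverse Gaussian tail function.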
In this article we study lossless compression of strings of pure quantum states via indeterminate-length quantum codes, which were introduced by Schumacher and Westmoreland. Past work has assumed that the strings of quantum data to be encoded are prepared in an independent and identically distributed way. We introduce the notion of quantum stochastic ensembles, which allows us to consider strings of quantum states prepared in a more general way. For any identically distributed quantum stochastic ensemble we define an associated quantum Markov chain and prove that the optimal average codeword length achievable by lossless coding equals the quantum dynamical entropy of the associated quantum Markov chain.
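In symbols (notation introduced here for illustration; the article's own notation may differ): for an identically distributed quantum stochastic ensemble $\mathcal{E}$ with associated quantum Markov chain $M(\mathcal{E})$, the result reads
\[
  \lim_{n\to\infty} \frac{1}{n}\,\mathbb{E}\bigl[\ell_n^{\mathrm{opt}}(\mathcal{E})\bigr] \;=\; h\bigl(M(\mathcal{E})\bigr),
\]
where $\ell_n^{\mathrm{opt}}(\mathcal{E})$ is the optimal total codeword length of a lossless code for the first $n$ states and $h$ denotes the quantum dynamical entropy.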
According to Kolmogorov complexity, every finite binary string is compressible to a shortest code -- its information content -- from which it is effectively recoverable. We investigate the extent to which this holds for infinite binary sequences (streams). We devise a new coding method which uniformly codes every stream $X$ into an algorithmically random stream $Y$, in such a way that the first $n$ bits of $X$ are recoverable from the first $I(X\upharpoonright_n)$ bits of $Y$, where $I$ is any partial computable information content measure which is defined on all prefixes of $X$, and where $X\upharpoonright_n$ is the initial segment of $X$ of length $n$. As a consequence, if $g$ is any computable upper bound on the initial segment prefix-free complexity of $X$, then $X$ is computable from an algorithmically random $Y$ with oracle-use at most $g$. Alternatively (making no use of such a computable bound $g$) one can achieve an oracle-use bounded above by $K(X\upharpoonright_n)+\log n$. This provides a strong analogue of Shannon's source coding theorem for algorithmic information theory.
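Displayed side by side (with $\mathrm{use}(n)$ denoting the oracle-use on the first $n$ bits of $X$, a notation we introduce here), the two regimes are
\[
  \mathrm{use}(n) \;\le\; g(n) \quad\text{for any computable } g \text{ with } K(X\upharpoonright_n) \le g(n),
\]
and, with no computable bound assumed,
\[
  \mathrm{use}(n) \;\le\; K(X\upharpoonright_n) + \log n.
\]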
Receiver diversity combining methods play a key role in combating the detrimental effects of fading in wireless communication and other applications. A novel diversity combining method is proposed where a universal, i.e., channel-independent, orthogonal dimension-reducing space-time transformation is applied prior to quantization of the signals. The scheme may be considered the counterpart of Alamouti modulation, and more generally of orthogonal space-time block codes.
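For reference, the classical Alamouti space-time block code that the proposed receiver-side transformation parallels sends two symbols $s_1, s_2$ over two antennas and two time slots as
\[
  \begin{pmatrix} s_1 & s_2 \\ -s_2^{*} & s_1^{*} \end{pmatrix},
\]
with rows indexing time slots and columns indexing antennas; the columns are orthogonal for any $s_1, s_2$. The abstract's scheme applies an analogous orthogonal, channel-independent transformation at the receiver prior to quantization.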