We present an adaptation of the standard Grassberger-Procaccia (GP) algorithm for estimating the correlation dimension of a time series in a non-subjective manner. The validity and accuracy of this approach are tested using different types of time series, such as those from standard chaotic systems, pure white and colored noise, and chaotic systems contaminated with noise. The effectiveness of the scheme in analysing noisy time series, particularly those involving colored noise, is investigated. An interesting result we obtain is that, for the same percentage of added noise, data with colored noise are more distinguishable from the corresponding surrogates than data with white noise. As examples of real-life applications, analyses of data from an astrophysical X-ray object and from human brain EEG are presented.
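For concreteness, a minimal sketch of the textbook GP estimate is given below (Python/NumPy assumed): a delay-embedded series, the correlation sum C(r), and a log-log slope as the D2 estimate. The embedding parameters, the test series and the global least-squares fit are illustrative only; in the standard algorithm the scaling region is chosen by inspection, which is precisely the subjectivity the adapted scheme aims to remove.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Time-delay embedding of a scalar series x into m dimensions with lag tau."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])

def correlation_sums(points, radii):
    """GP correlation sum C(r): fraction of point pairs closer than r."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d = d[np.triu_indices(len(points), k=1)]
    return np.array([np.mean(d < r) for r in radii])

# Illustrative use: the slope of log C(r) vs log r in the scaling region estimates D2.
x = np.sin(0.1 * np.arange(1000)) + 0.01 * np.random.randn(1000)   # placeholder series
pts = delay_embed(x, m=3, tau=10)
radii = np.logspace(-2, 0.5, 25)
C = correlation_sums(pts, radii)
good = C > 0
slope, _ = np.polyfit(np.log(radii[good]), np.log(C[good]), 1)
print("D2 estimate (naive global fit):", slope)
```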
Convergent Cross-Mapping (CCM) has shown high potential to perform causal inference in the absence of models. We assess the strengths and weaknesses of the method by varying coupling strength and noise levels in coupled logistic maps. We find that CCM fails to infer accurate coupling strength, and even the direction of causality, in synchronized time series and in the presence of intermediate coupling. We find that the presence of noise deterministically reduces the level of cross-mapping fidelity, while the convergence rate exhibits higher levels of robustness. Finally, we propose that controlled noise injections in intermediate-to-strongly coupled systems could enable more accurate causal inferences. Given the inherently noisy nature of real-world systems, our findings enable a more accurate evaluation of CCM applicability and suggest ways to overcome its weaknesses.
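A minimal sketch of the setting follows, assuming the bidirectionally coupled logistic maps commonly used in CCM studies and a simplified nearest-neighbour cross-map estimator; the parameter values, library length and weighting scheme are illustrative choices, not the exact configuration used in this study.

```python
import numpy as np

def coupled_logistic(n, rx=3.8, ry=3.5, bxy=0.02, byx=0.1, x0=0.4, y0=0.2):
    """Bidirectionally coupled logistic maps (byx: strength with which X drives Y)."""
    x, y = np.empty(n), np.empty(n)
    x[0], y[0] = x0, y0
    for t in range(n - 1):
        x[t + 1] = x[t] * (rx * (1 - x[t]) - bxy * y[t])
        y[t + 1] = y[t] * (ry * (1 - y[t]) - byx * x[t])
    return x, y

def cross_map_skill(source, target, m=2, tau=1, L=400):
    """Correlation between target and its prediction from the source's shadow manifold."""
    n = L - (m - 1) * tau
    lib = np.column_stack([source[i * tau: i * tau + n] for i in range(m)])
    tgt = target[(m - 1) * tau: (m - 1) * tau + n]
    preds = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(lib - lib[i], axis=1)
        d[i] = np.inf                          # exclude the point itself
        nn = np.argsort(d)[: m + 1]            # m+1 nearest neighbours
        w = np.exp(-d[nn] / max(d[nn][0], 1e-12))
        preds[i] = np.sum(w * tgt[nn]) / np.sum(w)
    return np.corrcoef(preds, tgt)[0, 1]

x, y = coupled_logistic(1000)
# X drives Y strongly (byx=0.1), so Y's shadow manifold should cross-map X well:
print("Y xmap X:", cross_map_skill(y, x), " X xmap Y:", cross_map_skill(x, y))
```

Convergence, the central diagnostic of CCM, would be assessed by repeating the estimate for increasing library lengths L.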
Multivariate time series with missing values are common in areas such as healthcare and finance, and have grown in number and complexity over the years. This raises the question of whether deep learning methodologies can outperform classical data imputation methods in this domain. However, naive applications of deep learning fall short in giving reliable confidence estimates and lack interpretability. We propose a new deep sequential latent variable model for dimensionality reduction and data imputation. Our modeling assumption is simple and interpretable: the high-dimensional time series has a lower-dimensional representation which evolves smoothly in time according to a Gaussian process. The non-linear dimensionality reduction in the presence of missing data is achieved using a VAE approach with a novel structured variational approximation. We demonstrate that our approach outperforms several classical and deep learning-based data imputation methods on high-dimensional data from the domains of computer vision and healthcare, while additionally improving the smoothness of the imputations and providing interpretable uncertainty estimates.
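A minimal sketch of the generative assumption only, under illustrative choices (RBF kernel, two latent dimensions, a random nonlinear decoder): a latent trajectory drawn from a Gaussian process is decoded into a higher-dimensional series, part of which is masked as missing. The inference side, i.e. the VAE encoder with the structured variational approximation trained to impute the masked entries, is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_latent, d_obs = 100, 2, 10          # time steps, latent and observed dims (illustrative)

# GP prior over each latent coordinate: smooth trajectories via an RBF kernel.
t = np.linspace(0, 1, T)
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.1 ** 2) + 1e-6 * np.eye(T)
z = rng.multivariate_normal(np.zeros(T), K, size=d_latent).T      # shape (T, d_latent)

# Nonlinear "decoder" mapping latents to observations (a random MLP stand-in).
W1, W2 = rng.normal(size=(d_latent, 16)), rng.normal(size=(16, d_obs))
x = np.tanh(z @ W1) @ W2 + 0.1 * rng.normal(size=(T, d_obs))      # observed series

# Missingness mask as in the imputation setting: entries with mask==False are unobserved.
mask = rng.random((T, d_obs)) > 0.3
x_observed = np.where(mask, x, np.nan)
```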
Analyzing data from paleoclimate archives such as tree rings or lake sediments offers the opportunity of inferring information on past climate variability. Often, such data sets are univariate, and a proper reconstruction of the system's higher-dimensional phase space can be crucial for further analyses. In this study, we systematically compare the methods of time delay embedding and differential embedding for phase space reconstruction. Differential embedding relates the system's higher-dimensional coordinates to the derivatives of the measured time series. For implementation, this requires robust and efficient algorithms to estimate derivatives from noisy and possibly non-uniformly sampled data. For this purpose, we consider several approaches: (i) central differences adapted to irregular sampling, (ii) a generalized version of discrete Legendre coordinates, and (iii) the concept of Moving Taylor Bayesian Regression. We evaluate the performance of differential and time delay embedding by studying two paradigmatic model systems - the Lorenz and the Rössler system. More precisely, we compare geometric properties of the reconstructed attractors to those of the original attractors by applying recurrence network analysis. Finally, we demonstrate the potential and the limitations of using the different phase space reconstruction methods in combination with windowed recurrence network analysis for inferring information about past climate variability. This is done by analyzing two well-studied paleoclimate data sets from Ecuador and Mexico. We find that studying the robustness of the results when varying the analysis parameters is an unavoidable step in order to make well-grounded statements on climate variability and to judge whether a data set is suitable for this kind of analysis.
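A minimal sketch contrasting the two reconstructions, assuming a scalar series and the three-point central-difference formula for non-uniform sampling (approach (i) above); the embedding parameters and the test signal are illustrative.

```python
import numpy as np

def delay_embedding(x, m=3, tau=5):
    """Time-delay embedding: rows (x_t, x_{t+tau}, ..., x_{t+(m-1)tau})."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])

def irregular_central_diff(t, x):
    """First derivative on a non-uniform grid via the three-point central formula."""
    dx = np.empty_like(x)
    h1, h2 = np.diff(t)[:-1], np.diff(t)[1:]
    dx[1:-1] = (-h2 / (h1 * (h1 + h2)) * x[:-2]
                + (h2 - h1) / (h1 * h2) * x[1:-1]
                + h1 / (h2 * (h1 + h2)) * x[2:])
    dx[0], dx[-1] = (x[1] - x[0]) / (t[1] - t[0]), (x[-1] - x[-2]) / (t[-1] - t[-2])
    return dx

def differential_embedding(t, x):
    """Differential embedding: (x, dx/dt, d2x/dt2) from successive numerical derivatives."""
    dx = irregular_central_diff(t, x)
    ddx = irregular_central_diff(t, dx)
    return np.column_stack([x, dx, ddx])

# Illustrative use on an irregularly sampled signal:
t = np.sort(np.random.uniform(0, 20, 2000))
x = np.sin(t) + 0.01 * np.random.randn(len(t))
A_delay = delay_embedding(x)
A_diff = differential_embedding(t, x)
```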
It is demonstrated how to generate time series with tailored nonlinearities by inducing well-defined constraints on the Fourier phases. Correlations between adjacent Fourier phases are related to (static and dynamic) measures of nonlinearity, and their origin is explained. By applying a set of simple constraints on the phases of an originally linear and uncorrelated Gaussian time series, the observed scaling behavior of the intensity distribution of empirical time series can be reproduced. The power-law character of the intensity distributions, typical for, e.g., turbulence and financial data, can thus be explained in terms of phase correlations.
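A minimal sketch of the idea, assuming NumPy: starting from an uncorrelated Gaussian series, the amplitude spectrum is kept fixed while a constraint is imposed on the Fourier phases before inverse transformation. The particular constraint below (dragging each phase towards its predecessor) is only one illustrative way of correlating adjacent phases, not the specific set of constraints used in the paper.

```python
import numpy as np

def phase_surrogate(x, phase_constraint=None, seed=0):
    """Series with the same power spectrum as x but modified Fourier phases.

    With phase_constraint=None this is the standard linear (phase-randomized)
    surrogate; otherwise the supplied function maps the random phases to a
    constrained set, which induces nonlinearity."""
    rng = np.random.default_rng(seed)
    X = np.fft.rfft(x - np.mean(x))
    phases = rng.uniform(0, 2 * np.pi, len(X))
    if phase_constraint is not None:
        phases = phase_constraint(phases)
    phases[0] = 0.0                      # keep the zero-frequency component real
    if len(x) % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist component real
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x))

def couple_adjacent(phases, strength=0.8):
    """Illustrative constraint: drag each phase towards its predecessor."""
    out = phases.copy()
    for k in range(1, len(out)):
        out[k] = (1 - strength) * out[k] + strength * out[k - 1]
    return out

x = np.random.randn(4096)                             # linear, uncorrelated Gaussian series
linear_series = phase_surrogate(x)                    # uncorrelated phases
nonlinear_series = phase_surrogate(x, couple_adjacent)  # adjacent phases correlated
```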
Natural and social multivariate systems are commonly studied through sets of simultaneous and time-spaced measurements of the observables that drive their dynamics, i.e., through sets of time series. Typically, this is done via hypothesis testing: the statistical properties of the empirical time series are tested against those expected under a suitable null hypothesis. This is a very challenging task in complex interacting systems, where statistical stability is often poor due to lack of stationarity and ergodicity. Here, we describe an unsupervised, data-driven framework to perform hypothesis testing in such situations. It consists of a statistical-mechanical approach - analogous to the configuration model for networked systems - for ensembles of time series designed to preserve, on average, some of the statistical properties observed in an empirical set of time series. We showcase its possible applications with a case study on financial portfolio selection.
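A minimal sketch of the general idea under strong simplifying assumptions: a null ensemble of synthetic series that preserves, on average, each series' mean and variance while destroying cross-dependencies, used to obtain a p-value for an empirical statistic (here the largest eigenvalue of the correlation matrix, in the spirit of portfolio applications). The Gaussian form of the ensemble and the choice of preserved statistics are illustrative, not the construction described above.

```python
import numpy as np

def marginal_ensemble(X, n_members=500, seed=0):
    """Null ensemble preserving, on average, each series' mean and variance
    while destroying cross-dependencies (X has shape T x N)."""
    rng = np.random.default_rng(seed)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    T, N = X.shape
    return mu + sigma * rng.standard_normal((n_members, T, N))

def p_value(statistic, X, ensemble):
    """Fraction of ensemble members whose statistic is at least the empirical value."""
    s_emp = statistic(X)
    s_null = np.array([statistic(member) for member in ensemble])
    return np.mean(s_null >= s_emp)

# Illustrative test: is the largest eigenvalue of the return-correlation matrix
# (a proxy for a dominant "market mode") larger than expected under the null?
C = 0.3 * np.ones((20, 20)) + 0.7 * np.eye(20)                 # placeholder correlations
returns = np.random.randn(250, 20) @ np.linalg.cholesky(C).T   # placeholder return series
lam_max = lambda Y: np.linalg.eigvalsh(np.corrcoef(Y, rowvar=False)).max()
print("p-value:", p_value(lam_max, returns, marginal_ensemble(returns)))
```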