No Arabic abstract
Inferring nonlinear and asymmetric causal relationships between multivariate longitudinal data is a challenging task with wide-ranging application areas including clinical medicine, mathematical biology, economics and environmental research. A number of methods for inferring causal relationships within complex dynamic and stochastic systems have been proposed but there is not a unified consistent definition of causality in this context. We evaluate the performance of ten prominent bivariate causality indices for time series data, across four simulated model systems that have different coupling schemes and characteristics. In further experiments, we show that these methods may not always be invariant to real-world relevant transformations (data availability, standardisation and scaling, rounding error, missing data and noisy data). We recommend transfer entropy and nonlinear Granger causality as likely to be particularly robust indices for estimating bivariate causal relationships in real-world applications. Finally, we provide flexible open-access Python code for computation of these methods and for the model simulations.
We study causality between bivariate curve time series using the Granger causality generalized measures of correlation. With this measure, we can investigate which curve time series Granger-causes the other; in turn, it helps determine the predictability of any two curve time series. Illustrated by a climatology example, we find that the sea surface temperature Granger-causes the sea-level atmospheric pressure. Motivated by a portfolio management application in finance, we single out those stocks that lead or lag behind Dow-Jones industrial averages. Given a close relationship between S&P 500 index and crude oil price, we determine the leading and lagging variables.
Convergent Cross-Mapping (CCM) has shown high potential to perform causal inference in the absence of models. We assess the strengths and weaknesses of the method by varying coupling strength and noise levels in coupled logistic maps. We find that CCM fails to infer accurate coupling strength and even causality direction in synchronized time-series and in the presence of intermediate coupling. We find that the presence of noise deterministically reduces the level of cross-mapping fidelity, while the convergence rate exhibits higher levels of robustness. Finally, we propose that controlled noise injections in intermediate-to-strongly coupled systems could enable more accurate causal inferences. Given the inherent noisy nature of real-world systems, our findings enable a more accurate evaluation of CCM applicability and advance suggestions on how to overcome its weaknesses.
Introduced more than a half century ago, Granger causality has become a popular tool for analyzing time series data in many application domains, from economics and finance to genomics and neuroscience. Despite this popularity, the validity of this notion for inferring causal relationships among time series has remained the topic of continuous debate. Moreover, while the original definition was general, limitations in computational tools have primarily limited the applications of Granger causality to simple bivariate vector auto-regressive processes or pairwise relationships among a set of variables. Starting with a review of early developments and debates, this paper discusses recent advances that address various shortcomings of the earlier approaches, from models for high-dimensional time series to more recent developments that account for nonlinear and non-Gaussian observations and allow for sub-sampled and mixed frequency time series.
A nonparametric method to predict non-Markovian time series of partially observed dynamics is developed. The prediction problem we consider is a supervised learning task of finding a regression function that takes a delay embedded observable to the observable at a future time. When delay embedding theory is applicable, the proposed regression function is a consistent estimator of the flow map induced by the delay embedding. Furthermore, the corresponding Mori-Zwanzig equation governing the evolution of the observable simplifies to only a Markovian term, represented by the regression function. We realize this supervised learning task with a class of kernel-based linear estimators, the kernel analog forecast (KAF), which are consistent in the limit of large data. In a scenario with a high-dimensional covariate space, we employ a Markovian kernel smoothing method which is computationally cheaper than the Nystrom projection method for realizing KAF. In addition to the guaranteed theoretical convergence, we numerically demonstrate the effectiveness of this approach on higher-dimensional problems where the relevant kernel features are difficult to capture with the Nystrom method. Given noisy training data, we propose a nonparametric smoother as a de-noising method. Numerically, we show that the proposed smoother is more accurate than EnKF and 4Dvar in de-noising signals corrupted by independent (but not necessarily identically distributed) noise, even if the smoother is constructed using a data set corrupted by white noise. We show skillful prediction using the KAF constructed from the denoised data.
Differential Granger causality, that is understanding how Granger causal relations differ between two related time series, is of interest in many scientific applications. Modeling each time series by a vector autoregressive (VAR) model, we propose a new method to directly learn the difference between the corresponding transition matrices in high dimensions. Key to the new method is an estimating equation constructed based on the Yule-Walker equation that links the difference in transition matrices to the difference in the corresponding precision matrices. In contrast to separately estimating each transition matrix and then calculating the difference, the proposed direct estimation method only requires sparsity of the difference of the two VAR models, and hence allows hub nodes in each high-dimensional time series. The direct estimator is shown to be consistent in estimation and support recovery under mild assumptions. These results also lead to novel consistency results with potentially faster convergence rates for estimating differences between precision matrices of i.i.d observations under weaker assumptions than existing results. We evaluate the finite sample performance of the proposed method using simulation studies and an application to electroencephalogram (EEG) data.