Causal inference is perhaps one of the most fundamental concepts in science, originating in the work of ancient philosophers and continuing through today, and it is woven strongly into current work by statisticians, machine learning experts, and scientists from many other fields. This paper takes the perspective of information flow, which includes the Nobel-prize-winning work on Granger causality and the recently popular transfer entropy, both of which are probabilistic in nature. Our main contribution is to develop analysis tools that allow a geometric interpretation of information flow as the causal inference indicated by positive transfer entropy. We describe the effective dimensionality of an underlying manifold, as projected into the outcome space, that summarizes information flow. Contrasting the probabilistic and geometric perspectives, we introduce a new measure of causal inference based on the fractal correlation dimension conditionally applied to competing explanations of future forecasts, which we write $GeoC_{y\rightarrow x}$. This avoids some of the boundedness issues that we show exist for the transfer entropy, $T_{y\rightarrow x}$. We illustrate our discussion with data generated from synthetic models of successively increasing complexity, then the Hénon map example, and finally a real physiological example relating breathing and heart rate function. Keywords: Causal Inference; Transfer Entropy; Differential Entropy; Correlation Dimension; Pinsker's Inequality; Frobenius-Perron operator.
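As background for the two quantities being contrasted (standard definitions in generic notation, not taken verbatim from the paper), Schreiber's transfer entropy with first-order conditioning and the Grassberger-Procaccia correlation dimension can be written as
\[
T_{y\rightarrow x} \;=\; \sum_{x_{t+1},\,x_t,\,y_t} p(x_{t+1},x_t,y_t)\,\log\frac{p(x_{t+1}\mid x_t,y_t)}{p(x_{t+1}\mid x_t)},
\]
\[
C(\epsilon) \;=\; \lim_{N\to\infty}\frac{2}{N(N-1)}\sum_{i<j}\Theta\!\big(\epsilon-\lVert z_i-z_j\rVert\big),
\qquad
D_2 \;=\; \lim_{\epsilon\to 0}\frac{\log C(\epsilon)}{\log\epsilon},
\]
where $\Theta$ is the Heaviside step function. Positive $T_{y\rightarrow x}$ is the probabilistic signature of information flow, while $GeoC_{y\rightarrow x}$ instead compares correlation dimensions conditioned on competing explanations of the forecast.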
Understanding and even defining what constitutes animal interactions remains a challenging problem. Correlational tools may be inappropriate for detecting communication between a set of many agents exhibiting nonlinear behavior. A different approach is to define coordinated motions in terms of an information-theoretic channel of direct causal information flow. In this work, we consider time series data obtained by an experimental protocol of optical tracking of the insect species Chironomus riparius. The data constitute reconstructed 3-D spatial trajectories of the insects' flight and their kinematics. We present an application of the optimal causation entropy (oCSE) principle to identify direct causal relationships, or information channels, among the insects. The collection of channels inferred by oCSE describes a network of information flow within the swarm. We find that information channels with a long spatial range are more common than expected under the assumption that causal information flows should be spatially localized. The tools developed herein are general and applicable to the inference and study of intercommunication networks in a wide variety of natural settings.
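For context, the causation entropy that oCSE builds on can be stated (in generic notation, which may differ from the paper's) as the additional information a candidate set of insects $Y$ provides about the future of insect $X$ beyond an already-conditioned set $Z$:
\[
C_{Y\rightarrow X\mid Z} \;=\; H\!\left(X_{t+1}\mid Z_t\right)\;-\;H\!\left(X_{t+1}\mid Z_t, Y_t\right),
\]
where $H(\cdot\mid\cdot)$ denotes conditional entropy; a channel $Y\rightarrow X$ is retained only if this quantity remains positive after conditioning on the other inferred influences.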
Recovery of the causal structure of dynamic networks from noisy measurements has long been a problem of intense interest across many areas of science and engineering. Many algorithms have been proposed, but no work compares their performance against converse bounds in a non-asymptotic setting. As a step toward addressing this problem, this paper gives lower bounds on the error probability for causal network support recovery in a linear Gaussian setting. The bounds are based on the use of the Bhattacharyya coefficient for binary hypothesis testing problems with mixture probability distributions. Comparisons of the bounds with the performance achieved by two representative recovery algorithms are given for sparse random networks based on the Erdős–Rényi model.
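For reference (standard facts, not results specific to the paper), the Bhattacharyya coefficient of two densities and the resulting lower bound on the error probability of an equal-prior binary hypothesis test are
\[
\rho(p_0,p_1) \;=\; \int \sqrt{p_0(x)\,p_1(x)}\,dx,
\qquad
P_e \;\ge\; \tfrac{1}{2}\Big(1-\sqrt{1-\rho^2(p_0,p_1)}\Big),
\]
which, applied to the mixture distributions induced by the presence or absence of a candidate edge, yields converse bounds of the kind studied here.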
Heterogeneity in medical data, e.g., from data collected at different sites and with different protocols in a clinical study, is a fundamental hurdle for accurate prediction using machine learning models, as such models often fail to generalize well. This paper leverages a recently proposed normalizing-flow-based method to perform counterfactual inference upon a structural causal model (SCM), in order to achieve harmonization of such data. A causal model is used to model observed effects (brain magnetic resonance imaging data) that result from known confounders (site, gender, and age) and exogenous noise variables. Our formulation exploits the bijection induced by the flow for the purpose of harmonization. We infer the posterior of the exogenous variables, intervene on the observations, and draw samples from the resultant SCM to obtain counterfactuals. This approach is evaluated extensively on multiple large, real-world medical datasets and displays better cross-domain generalization than state-of-the-art algorithms. We also provide further experiments that use regression and classification tasks to evaluate the quality of the confounder-independent data generated by our model.
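The abduction-intervention-prediction recipe behind such counterfactual harmonization can be illustrated with a minimal toy sketch, in which a hand-built affine bijection stands in for the learned normalizing flow; the site names, offsets, and scale below are purely hypothetical illustration values, not the paper's model.

import numpy as np

# Toy "flow": an invertible affine map x = mu[site] + sigma * z, where z is the
# exogenous noise. This is only a stand-in for a learned normalizing flow; the
# site-specific offsets and shared scale are made-up illustration values.
mu = {"siteA": 2.0, "siteB": 5.0}   # hypothetical site-specific offsets
sigma = 1.5                          # hypothetical shared scale

def generate(z, site):               # forward (generative) direction of the flow
    return mu[site] + sigma * z

def abduct(x, site):                 # invert the flow to recover exogenous noise
    return (x - mu[site]) / sigma

# Counterfactual query: "what would this siteA observation look like at siteB?"
x_obs = 3.1                          # an observation acquired at siteA
z_hat = abduct(x_obs, "siteA")       # 1) abduction: infer the exogenous variable
x_cf = generate(z_hat, "siteB")      # 2) intervention on site + 3) prediction
print(f"observed {x_obs:.2f} at siteA -> harmonized to siteB: {x_cf:.2f}")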
Making predictions in a robust way is not easy for nonlinear systems. In this work, a neural network computing framework, the spatiotemporal convolutional network (STCN), was developed to efficiently and accurately render multistep-ahead predictions of a time series by employing a spatial-temporal information (STI) transformation. The STCN combines the advantages of the temporal convolutional network (TCN) and the STI equation, which maps the high-dimensional/spatial data to future temporal values of a target variable, thus naturally providing the prediction of that variable. From the observed variables, the STCN also infers the causal factors of the target variable in the sense of Granger causality, which are in turn selected as effective spatial information to improve prediction robustness. The STCN was successfully applied to both benchmark systems and real-world datasets, showing superior and robust performance in multistep-ahead prediction, even when the data were perturbed by noise. From both theoretical and computational viewpoints, the STCN has great potential for practical applications in artificial intelligence (AI) and machine learning, as a model-free method based only on observed data, and it also opens a new way to explore observed high-dimensional data in a dynamical manner.
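A minimal sketch of the STI idea (under the assumption that the transformation maps the high-dimensional state at time $t$ to the next $L$ values of the target) is given below, with a linear ridge regression standing in for the TCN and a coupled logistic-map system used purely as illustrative data; none of the names or parameters come from the paper.

import numpy as np

# Minimal sketch of a spatial-temporal information (STI) style mapping: learn a
# map Phi from the high-dimensional state X_t to the next L values of a target
# variable y. Ridge regression stands in for the temporal convolutional network.
rng = np.random.default_rng(0)
T, D, L = 500, 20, 5                        # time steps, state dimension, horizon
X = np.zeros((T, D))
X[0] = rng.uniform(0.1, 0.9, D)
for t in range(T - 1):                      # weakly coupled logistic maps (toy data)
    X[t + 1] = 3.8 * X[t] * (1 - X[t]) + 0.02 * (np.roll(X[t], 1) - X[t])
y = X[:, 0]                                 # target variable = first coordinate

# Build training pairs (X_t, [y_{t+1}, ..., y_{t+L}]).
A = X[: T - L]
B = np.stack([y[k + 1 : T - L + k + 1] for k in range(L)], axis=1)

lam = 1e-3                                  # ridge penalty
W = np.linalg.solve(A.T @ A + lam * np.eye(D), A.T @ B)
y_hat = X[-1] @ W                           # multistep-ahead forecast from the last state
print("predicted next", L, "values of y:", np.round(y_hat, 3))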
Inferring the potential consequences of an unobserved event is a fundamental scientific question. To this end, Pearl's celebrated do-calculus provides a set of inference rules to derive an interventional probability from an observational one. In this framework, the primitive causal relations are encoded as functional dependencies in a Structural Causal Model (SCM), which are generally mapped into a Directed Acyclic Graph (DAG) in the absence of cycles. In this paper, by contrast, we capture causality without reference to graphs or functional dependencies, but with information fields and Witsenhausen's intrinsic model. The three rules of do-calculus reduce to a unique sufficient condition for conditional independence, the topological separation, which presents interesting theoretical and practical advantages over d-separation. With this single rule, we can deal with systems that cannot be represented with DAGs, for instance systems with cycles and/or spurious edges. We treat an example that cannot be handled, to the best of our knowledge, with the tools of the current literature. We also explain why, in the presence of cycles, the theory of causal inference might require different tools, depending on whether the random variables are discrete or continuous.
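For comparison, the three rules of Pearl's do-calculus that are here collapsed into a single topological-separation condition read, in their standard graphical form (with $G_{\overline{X}}$ the graph with edges into $X$ removed and $G_{\underline{Z}}$ the graph with edges out of $Z$ removed):
\[
\begin{aligned}
&\text{Rule 1: } && P(y\mid do(x),z,w)=P(y\mid do(x),w) &&\text{if } (Y\perp Z\mid X,W)_{G_{\overline{X}}},\\
&\text{Rule 2: } && P(y\mid do(x),do(z),w)=P(y\mid do(x),z,w) &&\text{if } (Y\perp Z\mid X,W)_{G_{\overline{X}\underline{Z}}},\\
&\text{Rule 3: } && P(y\mid do(x),do(z),w)=P(y\mid do(x),w) &&\text{if } (Y\perp Z\mid X,W)_{G_{\overline{X}\,\overline{Z(W)}}},
\end{aligned}
\]
where $Z(W)$ denotes the nodes of $Z$ that are not ancestors of any node of $W$ in $G_{\overline{X}}$.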