We study the problem of discovering the simplest latent variable that can make two observed discrete variables conditionally independent. The minimum entropy required for such a latent variable is known in information theory as common entropy. We extend this notion to Rényi common entropy by minimizing the Rényi entropy of the latent variable. To compute common entropy efficiently, we propose an iterative algorithm that can be used to discover the trade-off between the entropy of the latent variable and the conditional mutual information of the observed variables. We show two applications of common entropy in causal inference: First, under the assumption that there are no low-entropy mediators, it can be used to distinguish causation from spurious correlation for almost all joint distributions on simple causal graphs with two observed variables. Second, common entropy can be used to improve constraint-based methods such as the PC or FCI algorithms in the small-sample regime, where these methods are known to struggle. We propose a modification to these constraint-based methods that uses common entropy to assess whether a separating set they find is valid. Finally, we evaluate our algorithms on synthetic and real data to establish their performance.
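As a rough illustration of the trade-off this kind of algorithm traces (not the authors' iterative procedure, which is not reproduced here), the sketch below minimizes I(X;Y|Z) + beta * H(Z) over the conditional distribution q(z|x,y) with a generic optimizer; the toy distribution, latent cardinality, and all names are our own choices:

```python
# Illustrative sketch, not the paper's algorithm: sweep beta to trace the
# trade-off between H(Z) and I(X;Y|Z) for a latent Z with q(z|x,y) free.
import numpy as np
from scipy.optimize import minimize

def objective(theta, pxy, k, beta):
    nx, ny = pxy.shape
    # softmax parameterization keeps q(z|x,y) on the probability simplex
    logits = theta.reshape(nx, ny, k)
    q = np.exp(logits - logits.max(-1, keepdims=True))
    q /= q.sum(-1, keepdims=True)
    joint = pxy[:, :, None] * q                   # q(x, y, z)
    qz = joint.sum((0, 1))                        # q(z)
    qxz = joint.sum(1)                            # q(x, z)
    qyz = joint.sum(0)                            # q(y, z)
    eps = 1e-12
    # I(X;Y|Z) = sum q(x,y,z) log[ q(x,y,z) q(z) / (q(x,z) q(y,z)) ]
    cmi = np.sum(joint * (np.log(joint + eps) + np.log(qz + eps)[None, None, :]
                          - np.log(qxz + eps)[:, None, :]
                          - np.log(qyz + eps)[None, :, :]))
    hz = -np.sum(qz * np.log(qz + eps))           # H(Z)
    return cmi + beta * hz

# toy joint distribution p(x, y); larger beta favors simpler latents
rng = np.random.default_rng(0)
pxy = rng.random((3, 3)); pxy /= pxy.sum()
k = 3  # assumed latent cardinality
for beta in (0.1, 1.0, 10.0):
    res = minimize(objective, rng.standard_normal(3 * 3 * k),
                   args=(pxy, k, beta), method="L-BFGS-B")
    print(f"beta={beta}: objective={res.fun:.4f}")
```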
Quantum causality is an emerging field of study with the potential to greatly advance our understanding of quantum systems. One of the central problems in quantum causality is tied to the well-known aphorism that correlation does not imply causation. A direct generalization of existing causal inference techniques to the quantum domain is not possible due to superposition and entanglement. We put forth a new theoretical framework for merging quantum information science and causal inference by exploiting entropic principles. For this purpose, we leverage the concept of conditional density matrices to develop a scalable algorithmic approach for inferring causality in the presence of latent confounders (common causes) in quantum systems. We apply our proposed framework to the experimentally relevant scenario of identifying message senders on noisy quantum links, validating that the pre-noise input, acting as a latent confounder, is the cause of the noisy outputs. We also demonstrate that the proposed approach outperforms classical causal inference even when the variables are classical, by exploiting quantum dependence between variables through density matrices rather than joint probability distributions. Thus, the proposed approach unifies classical and quantum causal inference in a principled way. This successful inference on a synthetic quantum dataset can lay the foundations for identifying originators of malicious activity on future multi-node quantum networks.
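For intuition about why classical techniques do not carry over directly, the sketch below computes the von Neumann conditional entropy S(A|B) = S(rho_AB) - S(rho_B) from density matrices; unlike its classical counterpart it can be negative, as for the Bell state here. This is standard background machinery, not the paper's conditional-density-matrix construction:

```python
# Minimal sketch of entropic quantities on density matrices.
import numpy as np

def von_neumann_entropy(rho):
    # S(rho) = -Tr[rho log2 rho], computed from the eigenvalues of rho
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def partial_trace(rho, da, db, keep):
    r = rho.reshape(da, db, da, db)            # indices (a, b, a', b')
    if keep == "A":
        return np.trace(r, axis1=1, axis2=3)   # trace out B -> rho_A
    return np.trace(r, axis1=0, axis2=2)       # trace out A -> rho_B

# Bell state (|00> + |11>)/sqrt(2): S(AB) = 0 but S(B) = 1, so
# S(A|B) = -1, a negative value with no classical analogue.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_ab = np.outer(psi, psi.conj())
rho_b = partial_trace(rho_ab, 2, 2, keep="B")
print(von_neumann_entropy(rho_ab) - von_neumann_entropy(rho_b))  # -1.0
```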
As quantum computing and networking nodes scale up, important open questions arise about the causal influence of various subsystems on total system performance. These questions relate to the tomographic reconstruction of the macroscopic wavefunction, to optimizing the connectivity of large engineered qubit systems, to the reliable broadcasting of information across quantum networks, and to the speed-up of classical causal inference algorithms on quantum computers. A direct generalization of existing causal inference techniques to the quantum domain is not possible due to superposition and entanglement. We put forth a new theoretical framework for merging quantum information science and causal inference by exploiting entropic principles. First, we build a fundamental connection between the celebrated quantum marginal problem and entropic causal inference. Second, inspired by the definition of geometric quantum discord, we bridge the gap between classical conditional probabilities and quantum conditional density matrices. These fundamental theoretical advances are exploited to develop a scalable algorithmic approach for quantum entropic causal inference. We apply our proposed framework to the experimentally relevant scenario of identifying message senders on noisy quantum links. This successful inference on a synthetic quantum dataset can lay the foundations for identifying originators of malicious activity on future multi-node quantum networks. We unify classical and quantum causal inference in a principled way, paving the way for future applications in quantum computing and networking.
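The easy direction of the quantum marginal problem is fixed by the partial trace: a joint state determines its reduced states. The sketch below (our own minimal check, reusing the partial-trace construction above; not the paper's framework) verifies a candidate joint state against prescribed marginals. The hard converse, deciding whether given marginals are compatible with any joint state, is what the paper connects to entropic causal inference:

```python
# Consistency check for the forward direction of the marginal problem.
import numpy as np

def reduced_states(rho, da, db):
    r = rho.reshape(da, db, da, db)            # indices (a, b, a', b')
    return (np.trace(r, axis1=1, axis2=3),     # rho_A: trace out B
            np.trace(r, axis1=0, axis2=2))     # rho_B: trace out A

def marginals_match(rho_joint, rho_a, rho_b, da, db, tol=1e-9):
    # a joint state fixes its marginals; check them against candidates
    ra, rb = reduced_states(rho_joint, da, db)
    return np.allclose(ra, rho_a, atol=tol) and np.allclose(rb, rho_b, atol=tol)

# e.g. the two-qubit maximally mixed state has maximally mixed marginals
print(marginals_match(np.eye(4) / 4, np.eye(2) / 2, np.eye(2) / 2, 2, 2))  # True
```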
The ultimate goal of cognitive neuroscience is to understand the mechanistic neural processes underlying the functional organization of the brain. Key to this study is understanding both the structural and the functional connectivity between anatomical regions. In this paper we follow previous work in developing a simple dynamical model of the brain, simulating its various regions as Kuramoto oscillators whose coupling structure is described by a complex network. However, rather than generating a synthetic coupling network, we couple the oscillators through a real network of anatomical brain regions reconstructed from diffusion tensor imaging (DTI) data. Using an information-theoretic approach that defines direct information flow in terms of causation entropy (CSE), we show that we can recover the true structural network more accurately than either of the popular correlation or LASSO regression techniques. We demonstrate the effectiveness of our method on data simulated on the realistic DTI network, as well as on randomly generated small-world and Erdős-Rényi (ER) networks.
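A minimal sketch of the forward model, assuming a random adjacency matrix in place of the DTI-reconstructed network: each region i is a phase oscillator obeying dθ_i/dt = ω_i + K Σ_j A_ij sin(θ_j - θ_i). Inferring A back from the resulting time series, e.g. via causation entropy, is the recovery step evaluated in the paper and is not shown here:

```python
# Kuramoto oscillators coupled through a network, Euler integration.
import numpy as np

rng = np.random.default_rng(1)
n, k, dt, steps = 20, 0.5, 0.01, 5000
A = (rng.random((n, n)) < 0.2).astype(float)    # stand-in ER adjacency
np.fill_diagonal(A, 0)
omega = rng.normal(1.0, 0.1, n)                 # natural frequencies
theta = rng.uniform(0, 2 * np.pi, n)            # initial phases

traj = np.empty((steps, n))
for t in range(steps):
    # d(theta_i)/dt = omega_i + k * sum_j A_ij sin(theta_j - theta_i)
    coupling = (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
    theta = theta + dt * (omega + k * coupling)
    traj[t] = theta
print(traj.shape)  # the time series from which connectivity is inferred
```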
Constraint-based causal discovery from limited data is a notoriously difficult challenge due to the many borderline independence test decisions. Several approaches that improve the reliability of the predictions by exploiting redundancy in the independence information have been proposed recently. Though promising, existing approaches can still be greatly improved in terms of accuracy and scalability. We present a novel method that reduces the combinatorial explosion of the search space by using a more coarse-grained representation of causal information, drastically reducing computation time. Additionally, we propose a method to score causal predictions based on their confidence. Crucially, our implementation also allows one to easily combine observational and interventional data and to incorporate various types of available background knowledge. We prove soundness and asymptotic consistency of our method and demonstrate that it can outperform the state of the art on synthetic data, achieving a speedup of several orders of magnitude. We illustrate its practical feasibility by applying it to a challenging protein data set.
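For context, the sketch below shows the vanilla PC-style skeleton search that such constraint-based methods build on, with a crude partial-correlation threshold standing in for a proper independence test; it is not the authors' method, but it illustrates where the combinatorial explosion and the borderline test decisions come from:

```python
# Simplified skeleton search: drop edge i-j when some conditioning set s
# renders i and j (approximately) independent. All names are our own.
import itertools
import numpy as np

def partial_corr_indep(data, i, j, s, thresh=0.05):
    # crude Gaussian test: residualize i and j on the columns in s
    x, y = data[:, i], data[:, j]
    if s:
        Z = np.column_stack([data[:, list(s)], np.ones(len(data))])
        x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
        y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return abs(np.corrcoef(x, y)[0, 1]) < thresh

def skeleton(data, max_cond=2):
    n = data.shape[1]
    adj = {(i, j) for i in range(n) for j in range(n) if i < j}
    for size in range(max_cond + 1):          # grow conditioning sets
        for i, j in list(adj):
            others = [m for m in range(n) if m not in (i, j)]
            for s in itertools.combinations(others, size):
                if partial_corr_indep(data, i, j, s):
                    adj.discard((i, j))       # found a separating set
                    break
    return adj

# chain x -> z -> y: x and y should separate given {z}
rng = np.random.default_rng(0)
x = rng.normal(size=2000); z = x + rng.normal(size=2000)
y = z + rng.normal(size=2000)
print(skeleton(np.column_stack([x, z, y])))  # ideally {(0, 1), (1, 2)}
```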
While machine learning (ML) methods have received a lot of attention in recent years, these methods are primarily used for prediction. Empirical researchers conducting policy evaluations are, on the other hand, preoccupied with causal problems, trying to answer counterfactual questions: what would have happened in the absence of a policy? Because these counterfactuals can never be directly observed (known as the fundamental problem of causal inference), prediction tools from the ML literature cannot be readily used for causal inference. In the last decade, major innovations have taken place incorporating supervised ML tools into estimators for causal parameters such as the average treatment effect (ATE). This holds the promise of attenuating model misspecification issues and increasing transparency in model selection. One particularly mature strand of the literature includes approaches that incorporate supervised ML in the estimation of the ATE of a binary treatment, under the unconfoundedness and positivity assumptions (also known as the exchangeability and overlap assumptions). This article reviews popular supervised machine learning algorithms, including the Super Learner. Then, some specific uses of machine learning for treatment effect estimation are introduced and illustrated, namely (1) to create balance among treated and control groups, (2) to estimate so-called nuisance models (e.g. the propensity score, or conditional expectations of the outcome) in semi-parametric estimators that target causal parameters (e.g. targeted maximum likelihood estimation or the double ML estimator), and (3) to select variables in situations with a high number of covariates.
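As a concrete instance of use (2), a minimal doubly robust (AIPW) estimator with ML-fitted nuisance models; the learners, variable names, and toy data are our own choices, and the cross-fitting used in the double ML estimator is omitted for brevity:

```python
# Sketch: AIPW estimate of the ATE with ML nuisance models.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

def aipw_ate(X, a, y):
    # propensity score e(x) = P(A = 1 | X = x)
    e = GradientBoostingClassifier().fit(X, a).predict_proba(X)[:, 1]
    e = np.clip(e, 0.01, 0.99)               # enforce overlap/positivity
    # outcome regressions m1(x), m0(x) fit on treated / control units
    m1 = GradientBoostingRegressor().fit(X[a == 1], y[a == 1]).predict(X)
    m0 = GradientBoostingRegressor().fit(X[a == 0], y[a == 0]).predict(X)
    # AIPW: plug-in difference plus inverse-propensity-weighted residuals
    return np.mean(m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e))

# toy data with confounding by X and a true treatment effect of 2
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = 2 * a + X[:, 0] + rng.normal(size=2000)
print(aipw_ate(X, a, y))  # close to the true ATE of 2
```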