No Arabic abstract
We consider the problem of identifying the causal direction between two discrete random variables using observational data. Unlike previous work, we keep the most general functional model but make an assumption on the unobserved exogenous variable: Inspired by Occams razor, we assume that the exogenous variable is simple in the true causal direction. We quantify simplicity using Renyi entropy. Our main result is that, under natural assumptions, if the exogenous variable has low $H_0$ entropy (cardinality) in the true direction, it must have high $H_0$ entropy in the wrong direction. We establish several algorithmic hardness results about estimating the minimum entropy exogenous variable. We show that the problem of finding the exogenous variable with minimum entropy is equivalent to the problem of finding minimum joint entropy given $n$ marginal distributions, also known as minimum entropy coupling problem. We propose an efficient greedy algorithm for the minimum entropy coupling problem, that for $n=2$ provably finds a local optimum. This gives a greedy algorithm for finding the exogenous variable with minimum $H_1$ (Shannon Entropy). Our greedy entropy-based causal inference algorithm has similar performance to the state of the art additive noise models in real datasets. One advantage of our approach is that we make no use of the values of random variables but only their distributions. Our method can therefore be used for causal inference for both ordinal and also categorical data, unlike additive noise models.
As quantum computing and networking nodes scale-up, important open questions arise on the causal influence of various sub-systems on the total system performance. These questions are related to the tomographic reconstruction of the macroscopic wavefunction and optimizing connectivity of large engineered qubit systems, the reliable broadcasting of information across quantum networks as well as speed-up of classical causal inference algorithms on quantum computers. A direct generalization of the existing causal inference techniques to the quantum domain is not possible due to superposition and entanglement. We put forth a new theoretical framework for merging quantum information science and causal inference by exploiting entropic principles. First, we build the fundamental connection between the celebrated quantum marginal problem and entropic causal inference. Second, inspired by the definition of geometric quantum discord, we fill the gap between classical conditional probabilities and quantum conditional density matrices. These fundamental theoretical advances are exploited to develop a scalable algorithmic approach for quantum entropic causal inference. We apply our proposed framework to an experimentally relevant scenario of identifying message senders on quantum noisy links. This successful inference on a synthetic quantum dataset can lay the foundations of identifying originators of malicious activity on future multi-node quantum networks. We unify classical and quantum causal inference in a principled way paving the way for future applications in quantum computing and networking.
We consider the problem of learning a causal graph over a set of variables with interventions. We study the cost-optimal causal graph learning problem: For a given skeleton (undirected version of the causal graph), design the set of interventions with minimum total cost, that can uniquely identify any causal graph with the given skeleton. We show that this problem is solvable in polynomial time. Later, we consider the case when the number of interventions is limited. For this case, we provide polynomial time algorithms when the skeleton is a tree or a clique tree. For a general chordal skeleton, we develop an efficient greedy algorithm, which can be improved when the causal graph skeleton is an interval graph.
The ultimate goal of cognitive neuroscience is to understand the mechanistic neural processes underlying the functional organization of the brain. Key to this study is understanding structure of both the structural and functional connectivity between anatomical regions. In this paper we follow previous work in developing a simple dynamical model of the brain by simulating its various regions as Kuramoto oscillators whose coupling structure is described by a complex network. However in our simulations rather than generating synthetic networks, we simulate our synthetic model but coupled by a real network of the anatomical brain regions which has been reconstructed from diffusion tensor imaging (DTI) data. By using an information theoretic approach that defines direct information flow in terms of causation entropy (CSE), we show that we can more accurately recover the true structural network than either of the popular correlation or LASSO regression techniques. We demonstrate the effectiveness of our method when applied to data simulated on the realistic DTI network, as well as on randomly generated small-world and Erdos-Renyi (ER) networks.
We introduce a concept to quantify the intrinsic causal contribution of each variable in a causal directed acyclic graph to the uncertainty or information of some target variable. By recursively writing each node as function of the noise terms, we separate the information added by each node from the one obtained from its ancestors. To interpret this information as a causal contribution, we consider structure-preserving interventions that randomize each node in a way that mimics the usual dependence on the parents and dont perturb the observed joint distribution. Using Shapley values, the contribution becomes independent of the ordering of nodes. We describe our contribution analysis for variance and entropy as two important examples, but contributions for other target metrics can be defined analogously.
In this paper we treat both forms of probabilistic inference, estimating marginal probabilities of the joint distribution and finding the most probable assignment, through a unified message-passing algorithm architecture. We generalize the Belief Propagation (BP) algorithms of sum-product and max-product and tree-rewaighted (TRW) sum and max product algorithms (TRBP) and introduce a new set of convergent algorithms based on convex-free-energy and Linear-Programming (LP) relaxation as a zero-temprature of a convex-free-energy. The main idea of this work arises from taking a general perspective on the existing BP and TRBP algorithms while observing that they all are reductions from the basic optimization formula of $f + sum_i h_i$ where the function $f$ is an extended-valued, strictly convex but non-smooth and the functions $h_i$ are extended-valued functions (not necessarily convex). We use tools from convex duality to present the primal-dual ascent algorithm which is an extension of the Bregman successive projection scheme and is designed to handle optimization of the general type $f + sum_i h_i$. Mapping the fractional-free-energy variational principle to this framework introduces the norm-product message-passing. Special cases include sum-product and max-product (BP algorithms) and the TRBP algorithms. When the fractional-free-energy is set to be convex (convex-free-energy) the norm-product is globally convergent for estimating of marginal probabilities and for approximating the LP-relaxation. We also introduce another branch of the norm-product, the convex-max-product. The convex-max-product is convergent (unlike max-product) and aims at solving the LP-relaxation.