We congratulate Engelke and Hitz on a thought-provoking paper on graphical models for extremes. A key contribution of the paper is the introduction of a novel definition of conditional independence for a multivariate Pareto distribution. Here, we outline a proposal for independence and conditional independence of general random variables whose support is a general set $\Omega$ in multidimensional real space. Our proposal includes the authors' definition of conditional independence and the analogous definition of independence as special cases. By making our proposal independent of the context of extreme value theory, we highlight the importance of the authors' contribution beyond this particular context.
We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$, where $X$, $Y$ and $Z$ are three real random variables and $Z$ is continuous. We focus on two main cases: when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design non-trivial tests that control the type I error for all absolutely continuous conditionally independent distributions while still ensuring power against interesting alternatives. Consequently, we identify various natural smoothness assumptions on the conditional distributions of $X,Y|Z=z$ as $z$ varies in the support of $Z$, and study the hardness of conditional independence testing under these smoothness assumptions. We derive matching lower and upper bounds on the critical radius of separation between the null and alternative hypotheses in the total variation metric. The tests we consider are easily implementable and rely on binning the support of the continuous variable $Z$. To complement these results, we provide a new proof of the hardness result of Shah and Peters.
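The binning idea for the discrete case can be illustrated with a minimal sketch. This is not the test analysed in the paper: the statistic (a sum of per-bin chi-square statistics), the equal-width bins, the within-bin permutation calibration, and the names `binned_statistic` and `binned_ci_test` are all assumptions made for illustration. Permuting within bins is only approximately valid, and only when the conditional distributions vary smoothly in $z$.

```python
import numpy as np

def binned_statistic(x_idx, y_idx, bins, n_bins, kx, ky):
    """Sum over bins of Z of the chi-square statistic of the X-Y table."""
    total = 0.0
    for b in range(n_bins):
        mask = bins == b
        m = int(mask.sum())
        if m < 2:
            continue  # a bin with fewer than two points carries no information
        table = np.zeros((kx, ky))
        np.add.at(table, (x_idx[mask], y_idx[mask]), 1.0)
        expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / m
        nz = expected > 0  # skip cells whose row or column is empty in this bin
        total += float(((table[nz] - expected[nz]) ** 2 / expected[nz]).sum())
    return total

def binned_ci_test(x, y, z, n_bins=10, n_perm=200, seed=0):
    """Permutation p-value for 'X independent of Y given Z' (illustrative only)."""
    rng = np.random.default_rng(seed)
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    z = np.asarray(z, dtype=float)
    # Equal-width bins over the observed range of Z (an assumed choice).
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    bins = np.digitize(z, edges[1:-1])  # bin labels 0 .. n_bins - 1
    kx, ky = len(x_vals), len(y_vals)
    observed = binned_statistic(x_idx, y_idx, bins, n_bins, kx, ky)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y_idx.copy()
        for b in range(n_bins):
            idx = np.flatnonzero(bins == b)
            y_perm[idx] = y_idx[rng.permutation(idx)]  # shuffle Y inside the bin
        if binned_statistic(x_idx, y_perm, bins, n_bins, kx, ky) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)
```

Calling `binned_ci_test(x, y, z)` on integer-coded or categorical arrays `x`, `y` and a continuous array `z` returns a p-value; the number of bins governs the trade-off between bias from within-bin variation of the conditional law and the amount of data per bin.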
Measuring conditional independence is an important task in statistical inference and is fundamental in causal discovery, feature selection, dimensionality reduction, Bayesian network learning, and other areas. In this work, we explore the connection between conditional independence measures induced by distances on a metric space and reproducing kernels associated with a reproducing kernel Hilbert space (RKHS). For certain distance and kernel pairs, we show the distance-based conditional independence measures to be equivalent to kernel-based measures. On the other hand, we also show that some kernel conditional independence measures popular in machine learning, based on the Hilbert-Schmidt norm of a certain cross-conditional covariance operator, do not have a simple distance representation, except in some limiting cases. This paper therefore shows that the distance and kernel measures of conditional independence are not quite equivalent, unlike in the case of joint independence as shown by Sejdinovic et al. (2013).
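For the joint independence case attributed above to Sejdinovic et al. (2013), the equivalence can be checked numerically. The sketch below is an illustration under stated assumptions, not code from either paper: it uses Euclidean distance, V-statistic (biased) estimators, and a distance-induced kernel centred at an arbitrary sample point; the function names are hypothetical. With these choices the sample distance covariance equals four times the sample HSIC. It does not cover the conditional measures studied in the abstract.

```python
import numpy as np

def pairwise_dist(a):
    """Euclidean distance matrix of the rows of a."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    diff = a[:, None, :] - a[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def center(m):
    """Double-centre a square matrix (equivalent to H m H)."""
    return m - m.mean(axis=0, keepdims=True) - m.mean(axis=1, keepdims=True) + m.mean()

def dcov_sq(x, y):
    """V-statistic estimate of squared distance covariance."""
    return float((center(pairwise_dist(x)) * center(pairwise_dist(y))).mean())

def distance_kernel(d, base=0):
    """Distance-induced kernel k(u, v) = (d(u, u0) + d(v, u0) - d(u, v)) / 2,
    with u0 taken to be sample point `base` (an arbitrary choice)."""
    return 0.5 * (d[:, [base]] + d[[base], :] - d)

def hsic(x, y, base=0):
    """V-statistic HSIC, trace(K H L H) / n^2, with distance-induced kernels."""
    K = distance_kernel(pairwise_dist(x), base)
    L = distance_kernel(pairwise_dist(y), base)
    return float((center(K) * center(L)).mean())
```

The identity holds because double-centring removes the terms that depend on the base point, leaving each centred kernel matrix equal to minus one half of the centred distance matrix.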
This chapter of the forthcoming Handbook of Graphical Models contains an overview of basic theorems and techniques from algebraic geometry and how they can be applied to the study of conditional independence and graphical models. It also introduces binomial ideals and some ideas from real algebraic geometry. When random variables are discrete or Gaussian, tools from computational algebraic geometry can be used to understand implications between conditional independence statements. This is accomplished by computing primary decompositions of conditional independence ideals. As examples, the chapter presents in detail the graphical model of a four-cycle and the intersection axiom, a certain implication of conditional independence statements. Another important problem in the area is to determine all constraints on a graphical model, for example, equations determined by trek separation. The full set of equality constraints can be determined by computing the model's vanishing ideal. The chapter illustrates these techniques and ideas with examples from the literature and provides references for further reading.
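For reference, the intersection axiom mentioned above is usually stated as follows; the variable names $X_1, X_2, X_3$ are chosen here for illustration and need not match the chapter's notation. The implication holds for distributions with strictly positive density but can fail otherwise, which is why the components of a primary decomposition are informative.

```latex
% Intersection axiom (standard form; notation assumed for illustration)
\[
  X_1 \perp\!\!\!\perp X_2 \mid X_3
  \quad\text{and}\quad
  X_1 \perp\!\!\!\perp X_3 \mid X_2
  \;\;\Longrightarrow\;\;
  X_1 \perp\!\!\!\perp (X_2, X_3).
\]
```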
Lattice Conditional Independence (LCI) models are a class of models, developed first for the Gaussian case, in which a distributive lattice classifies all the conditional independence statements. The main result is that these models can equivalently be described via a transitive directed acyclic graph (TDAG) in which, as is normal for causal models, the conditional independence is in terms of conditioning on ancestors in the graph. We aim to demonstrate that a parallel stream of research in algebra, the theory of Hibi ideals, not only maps directly to the LCI models but also gives a vehicle to generalise the theory from the linear Gaussian case. Given a distributive lattice, (i) each conditional independence statement is associated with a Hibi relation defined on the lattice, (ii) the directed graph is given by chains in the lattice which correspond to chains of conditional independence, (iii) the elimination ideal of product terms in the chains gives the Hibi ideal, and (iv) the TDAG can be recovered from a special bipartite graph constructed via the Alexander dual of the Hibi ideal. It is briefly demonstrated that there are natural applications to statistical log-linear models, time series, and Shannon information flow.
Discussion of ``Least angle regression'' by Efron et al. [math.ST/0406456]