Building on the theory of causal discovery from observational data, we study interactions between multiple (sets of) random variables in a linear structural equation model with non-Gaussian error terms. We give a correspondence between structure in the higher-order cumulants and combinatorial structure in the causal graph. It has previously been shown that low rank of submatrices of the covariance matrix corresponds to trek separation in the graph. Generalizing this criterion to multiple sets of vertices, we characterize when determinants of subtensors of the higher-order cumulant tensors vanish. The criterion also applies when hidden variables are present. For instance, it allows us to identify the presence of a hidden common cause of $k$ of the observed variables.
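As a concrete illustration of the cumulant criterion, the following minimal sketch (the one-factor model, its coefficients, and the sample size are invented here for illustration and are not taken from the paper) simulates a hidden non-Gaussian common cause of three observed variables and checks numerically that $2\times 2$ determinants of slices of the third-order cumulant tensor vanish:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Hypothetical one-factor model: one hidden non-Gaussian cause H drives
# three observed variables, X_i = lam_i * H + eps_i with Gaussian noise.
lam = np.array([1.0, 0.7, -0.5])
H = rng.exponential(1.0, n) - 1.0            # centered, skewed => non-Gaussian
X = np.outer(H, lam) + rng.normal(0.0, 0.3, (n, 3))
X -= X.mean(axis=0)

# Empirical third-order cumulant tensor: for centered data,
# cum(X_i, X_j, X_k) = E[X_i X_j X_k].
T = np.einsum('ni,nj,nk->ijk', X, X, X) / n

# The model predicts T_ijk = kappa_3(H) * lam_i * lam_j * lam_k, a rank-one
# tensor, so every 2x2 determinant of every matrix slice should be ~0
# (exactly 0 in the population, up to sampling error here).
for k in range(3):
    minor = np.linalg.det(T[:2, :2, k])
    print(f"slice k={k}: 2x2 determinant = {minor:+.2e}")
```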
Directed graphical models specify noisy functional relationships among a collection of random variables. In the Gaussian case, each such model corresponds to a semi-algebraic set of positive definite covariance matrices. The set is given via parametrization, and much work has gone into obtaining an implicit description in terms of polynomial (in-)equalities. Implicit descriptions shed light on problems such as parameter identification, model equivalence, and constraint-based statistical inference. For models given by directed acyclic graphs, which represent settings where all relevant variables are observed, there is a complete theory: All conditional independence relations can be found via graphical $d$-separation and are sufficient for an implicit description. The situation is far more complicated, however, when some of the variables are hidden (or in other words, unobserved or latent). We consider models associated to mixed graphs that capture the effects of hidden variables through correlated error terms. The notion of trek separation explains when the covariance matrix in such a model has submatrices of low rank and generalizes $d$-separation. However, in many cases, such as the infamous Verma graph, the polynomials defining the graphical model are not determinantal, and hence cannot be explained by $d$-separation or trek separation. In this paper, we show that these constraints often correspond to the vanishing of nested determinants and can be graphically explained by a notion of restricted trek separation.
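To make the low-rank phenomenon behind trek separation concrete, here is a minimal numerical sketch on a hypothetical five-vertex graph with invented coefficients (this illustrates ordinary trek separation, not the Verma constraint or restricted trek separation): vertex 3 is a choke point between $\{1,2\}$ and $\{4,5\}$, so the corresponding $2\times 2$ covariance determinant vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical graph with edges 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5: every trek
# between {1,2} and {4,5} passes through vertex 3, so the single vertex 3
# trek-separates the two sets.
a, b, c, d = 0.8, -0.6, 0.5, 0.9
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
X3 = a * X1 + b * X2 + rng.normal(size=n)
X4 = c * X3 + rng.normal(size=n)
X5 = d * X3 + rng.normal(size=n)

S = np.cov(np.stack([X1, X2, X3, X4, X5]))

# Trek separation predicts rank(Sigma_{{4,5},{1,2}}) <= 1, i.e. its 2x2
# determinant vanishes (here: up to sampling error).
print("det Sigma_{45,12} =", np.linalg.det(S[np.ix_([3, 4], [0, 1])]))  # ~ 0
# A minor not certified by trek separation stays away from zero (~ -abc).
print("det Sigma_{14,23} =", np.linalg.det(S[np.ix_([0, 3], [1, 2])]))
```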
We consider a bivariate time series $(X_t,Y_t)$ that is given by a simple linear autoregressive model. Assuming that the equations describing each variable as a linear combination of past values are structural equations, there is a clear meaning of how intervening on one particular $X_t$ influences $Y_{t'}$ at later times $t'>t$. In the present work, we describe conditions under which one can define a causal model between variables that are coarse-grained in time, thus admitting statements like `setting $X$ to $x$ changes $Y$ in a certain way' without referring to specific time instances. We show that particularly simple statements follow in the frequency domain, thus providing meaning to interventions on frequencies.
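The following sketch illustrates one way such a frequency-domain statement can look, under assumptions made here for illustration (invented coefficients, and a noise-free check of the structural part only; this is not the paper's formal construction): intervening to set $X$ to a sinusoid of frequency $\omega$ changes the structural part of $Y$ by the gain and phase of the transfer function.

```python
import numpy as np

# Hypothetical structural equations (coefficients invented for illustration):
#   X_t = 0.5 X_{t-1} + eps^X_t
#   Y_t = 0.9 X_{t-1} - 0.4 X_{t-2} + eps^Y_t
# Under do(X_t = cos(omega t)), X's own equation is replaced by the intervention.
b = np.array([0.9, -0.4])                    # lags 1 and 2 of X in Y's equation

# Transfer function B(omega) = sum_k b_k exp(-i k omega): setting X to a
# cosine at frequency omega makes the structural part of Y a cosine at the
# same frequency, with amplitude |B(omega)| and phase shift arg B(omega).
omega = np.linspace(0, np.pi, 5)
B = b[0] * np.exp(-1j * omega) + b[1] * np.exp(-2j * omega)
for w, g in zip(omega, B):
    print(f"omega={w:4.2f}: gain={abs(g):.3f}, phase={np.angle(g):+.3f} rad")

# Check one frequency by simulating the intervention do(X_t = cos(omega t)).
t = np.arange(10_000)
w = omega[2]
X = np.cos(w * t)
Y = b[0] * np.roll(X, 1) + b[1] * np.roll(X, 2)   # noise-free structural part
pred = abs(B[2]) * np.cos(w * t + np.angle(B[2]))
print("max deviation:", np.max(np.abs(Y[10:] - pred[10:])))   # ~ 0
```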
For several variants of regression models, including partial linear, measurement-error (errors-in-variables), latent-effects, semiparametric, and otherwise corrupted linear models, the classical parametric tests generally do not perform well. The various modifications and generalizations considered extensively in the literature rest on stringent regularity assumptions that are unlikely to be tenable in many applications. In such non-standard cases, however, rank-based tests can be adapted better, and, further, the incorporation of rank analysis of covariance tools enhances their power-efficiency. Numerical studies and a real data illustration show the superiority of rank-based inference in such corrupted linear models.
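As a rough illustration of why rank-based tests can be preferable in corrupted linear models, the following simulation sketch (all settings invented here; this is not the paper's numerical study) compares the classical t-test for a regression slope with a Kendall-tau rank test under errors-in-variables and heavy-tailed noise:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def one_trial(n=60, beta=0.4):
    # Hypothetical corrupted linear model: measurement error in the
    # regressor plus Cauchy-tailed response noise (settings invented).
    x_true = rng.normal(size=n)
    x_obs = x_true + rng.normal(scale=0.5, size=n)     # errors-in-variables
    y = beta * x_true + rng.standard_t(df=1, size=n)   # heavy-tailed noise
    p_param = stats.linregress(x_obs, y).pvalue        # classical t-test
    p_rank = stats.kendalltau(x_obs, y).pvalue         # rank-based test
    return p_param < 0.05, p_rank < 0.05

res = np.array([one_trial() for _ in range(2000)])
print("power, parametric t-test:", res[:, 0].mean())
print("power, rank (Kendall)   :", res[:, 1].mean())
```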
In this paper, we explore a connection between binary hierarchical models, their marginal polytopes, and codeword polytopes, i.e., the convex hulls of linear codes. We determine the class of linear codes that are realizable by hierarchical models. We classify all full-dimensional polytopes with the property that their vertices form a linear code and give an algorithm that determines them.
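For a concrete toy instance of a codeword polytope (chosen here for illustration; the paper's classification and algorithm are more general), the following sketch builds the length-3 even-weight code and verifies that its codeword polytope, a tetrahedron with vertices in $\{0,1\}^3$, is full-dimensional:

```python
import numpy as np
from itertools import product

# The length-3 even-weight (single parity-check) code, with generator
# matrix G over GF(2). Its codewords, viewed as 0/1 points in R^3, are
# the vertices of the codeword polytope.
G = np.array([[1, 0, 1],
              [0, 1, 1]])

codewords = sorted({tuple(np.array(m) @ G % 2)
                    for m in product([0, 1], repeat=G.shape[0])})
print("codewords:", codewords)   # (0,0,0), (0,1,1), (1,0,1), (1,1,0)

# Full-dimensionality check: the affine dimension of the vertex set is the
# rank of the differences from one fixed vertex.
V = np.array(codewords)
diffs = V[1:] - V[0]
print("affine dimension:", np.linalg.matrix_rank(diffs),
      "in ambient dimension", V.shape[1])
```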
For a nonlinear regression model, the information matrices of designs depend on the parameter of the model. The adaptive Wynn-algorithm for D-optimal design estimates the parameter at each step on the basis of the design points employed and the responses observed so far, and selects the next design point as in the classical Wynn-algorithm for D-optimal design. The name `Wynn-algorithm' honors Henry P. Wynn, who established the latter `classical' algorithm in his 1970 paper. The asymptotics of the sequences of designs and maximum likelihood estimates generated by the adaptive algorithm are studied for an important class of nonlinear regression models: generalized linear models whose (univariate) response variables follow a distribution from a one-parameter exponential family. Under the assumptions of compactness of the experimental region and of the parameter space, together with some natural continuity assumptions, it is shown that the adaptive ML-estimators are strongly consistent and the design sequence is asymptotically locally D-optimal at the true parameter point. If the true parameter point is an interior point of the parameter space, then under some smoothness assumptions the asymptotic normality of the adaptive ML-estimators is obtained.
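For reference, here is a minimal sketch of the classical (non-adaptive) Wynn-algorithm on an invented linear-model example, quadratic regression $f(x)=(1,x,x^2)$ on $[-1,1]$, where the D-optimal design is known to put equal weight on $\{-1,0,1\}$; the adaptive version studied in the paper additionally re-estimates the model parameter at each step and uses parameter-dependent information matrices.

```python
import numpy as np

# Classical Wynn step: add the candidate point maximizing the variance
# function d(x, xi) = f(x)^T M(xi)^{-1} f(x) of the current design xi.
def f(x):
    return np.array([1.0, x, x * x])

grid = np.linspace(-1, 1, 201)        # candidate design points
points = [-1.0, 0.3, 1.0]             # any nonsingular starting design

for _ in range(2000):
    F = np.array([f(x) for x in points])
    Minv = np.linalg.inv(F.T @ F / len(points))   # inverse information matrix
    d = np.array([f(x) @ Minv @ f(x) for x in grid])
    points.append(grid[np.argmax(d)])

vals, counts = np.unique(np.round(points[3:], 2), return_counts=True)
print("approximate support of limiting design:", vals[counts > 50])  # ~ {-1, 0, 1}
```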