Graph Independence Testing


الملخص بالإنكليزية

Identifying statistically significant dependency between variables is a key step in scientific discoveries. Many recent methods, such as distance and kernel tests, have been proposed for valid and consistent independence testing and can be applied to data in Euclidean and non-Euclidean spaces. However, in those works, $n$ pairs of points in $mathcal{X} times mathcal{Y}$ are observed. Here, we consider the setting where a pair of $n times n$ graphs are observed, and the corresponding adjacency matrices are treated as kernel matrices. Under a $rho$-correlated stochastic block model, we demonstrate that a naive test (permutation and Pearsons) for a conditional dependency graph model is invalid. Instead, we propose a block-permutation procedure. We prove that our procedure is valid and consistent -- even when the two graphs have different marginal distributions, are weighted or unweighted, and the latent vertex assignments are unknown -- and provide sufficient conditions for the tests to estimate $rho$. Simulations corroborate these results on both binary and weighted graphs. Applying these tests to the whole-organism, single-cell-resolution structural connectomes of C. elegans, we identify strong statistical dependency between the chemical synapse connectome and the gap junction connectome.

تحميل البحث