Model-free consistency of graph partitioning


Abstract in English

In this paper, we exploit the theory of dense graph limits to provide a new framework to study the stability of graph partitioning methods, which we call structural consistency. Both stability under perturbation as well as asymptotic consistency (i.e., convergence with probability $1$ as the sample size goes to infinity under a fixed probability model) follow from our notion of structural consistency. By formulating structural consistency as a continuity result on the graphon space, we obtain robust results that are completely independent of the data generating mechanism. In particular, our results apply in settings where observations are not independent, thereby significantly generalizing the common probabilistic approach where data are assumed to be i.i.d. In order to make precise the notion of structural consistency of graph partitioning, we begin by extending the theory of graph limits to include vertex colored graphons. We then define continuous node-level statistics and prove that graph partitioning based on such statistics is consistent. Finally, we derive the structural consistency of commonly used clustering algorithms in a general model-free setting. These include clustering based on local graph statistics such as homomorphism densities, as well as the popular spectral clustering using the normalized Laplacian. We posit that proving the continuity of clustering algorithms in the graph limit topology can stand on its own as a more robust form of model-free consistency. We also believe that the mathematical framework developed in this paper goes beyond the study of clustering algorithms, and will guide the development of similar model-free frameworks to analyze other procedures in the broader mathematical sciences.

Download