No Arabic abstract
Local graph clustering methods aim to find small clusters in very large graphs. These methods take as input a graph and a seed node, and they return as output a good cluster in a running time that depends on the size of the output cluster but that is independent of the size of the input graph. In this paper, we adopt a statistical perspective on local graph clustering, and we analyze the performance of the l1-regularized PageRank method~(Fountoulakis et. al.) for the recovery of a single target cluster, given a seed node inside the cluster. Assuming the target cluster has been generated by a random model, we present two results. In the first, we show that the optimal support of l1-regularized PageRank recovers the full target cluster, with bounded false positives. In the second, we show that if the seed node is connected solely to the target cluster then the optimal support of l1-regularized PageRank recovers exactly the target cluster. We also show empirically that l1-regularized PageRank has a state-of-the-art performance on many real graphs, demonstrating the superiority of the method. From a computational perspective, we show that the solution path of l1-regularized PageRank is monotonic. This allows for the application of the forward stagewise algorithm, which approximates the solution path in running time that does not depend on the size of the whole graph. Finally, we show that l1-regularized PageRank and approximate personalized PageRank (APPR), another very popular method for local graph clustering, are equivalent in the sense that we can lower and upper bound the output of one with the output of the other. Based on this relation, we establish for APPR similar results to those we establish for l1-regularized PageRank.
We provide statistical learning guarantees for two unsupervised learning tasks in the context of compressive statistical learning, a general framework for resource-efficient large-scale learning that we introduced in a companion paper.The principle of compressive statistical learning is to compress a training collection, in one pass, into a low-dimensional sketch (a vector of random empirical generalized moments) that captures the information relevant to the considered learning task. We explicitly describe and analyze random feature functions which empirical averages preserve the needed information for compressive clustering and compressive Gaussian mixture modeling with fixed known variance, and establish sufficient sketch sizes given the problem dimensions.
Local graph clustering and the closely related seed set expansion problem are primitives on graphs that are central to a wide range of analytic and learning tasks such as local clustering, community detection, nodes ranking and feature inference. Prior work on local graph clustering mostly falls into two categories with numerical and combinatorial roots respectively. In this work, we draw inspiration from both fields and propose a family of convex optimization formulations based on the idea of diffusion with p-norm network flow for $pin (1,infty)$. In the context of local clustering, we characterize the optimal solutions for these optimization problems and show their usefulness in finding low conductance cuts around input seed set. In particular, we achieve quadratic approximation of conductance in the case of $p=2$ similar to the Cheeger-type bounds of spectral methods, constant factor approximation when $prightarrowinfty$ similar to max-flow based methods, and a smooth transition for general $p$ values in between. Thus, our optimization formulation can be viewed as bridging the numerical and combinatorial approaches, and we can achieve the best of both worlds in terms of speed and noise robustness. We show that the proposed problem can be solved in strongly local running time for $pge 2$ and conduct empirical evaluations on both synthetic and real-world graphs to illustrate our approach compares favorably with existing methods.
In this paper, we investigate the decentralized statistical inference problem, where a network of agents cooperatively recover a (structured) vector from private noisy samples without centralized coordination. Existing optimization-based algorithms suffer from issues of model mismatch and poor convergence speed, and thus their performance would be degraded, provided that the number of communication rounds is limited. This motivates us to propose a learning-based framework, which unrolls well-noted decentralized optimization algorithms (e.g., Prox-DGD and PG-EXTRA) into graph neural networks (GNNs). By minimizing the recovery error via end-to-end training, this learning-based framework resolves the model mismatch issue. Our convergence analysis (with PG-EXTRA as the base algorithm) reveals that the learned model parameters may accelerate the convergence and reduce the recovery error to a large extent. The simulation results demonstrate that the proposed GNN-based learning methods prominently outperform several state-of-the-art optimization-based algorithms in convergence speed and recovery error.
We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide experimental comparison for several approximate BNN inference techniques on image classification tasks associated to MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of uncertainty of BNN predictions in adversarial settings.
Generative Adversarial Networks (GANs) have achieved great success in unsupervised learning. Despite the remarkable empirical performance, there are limited theoretical understandings on the statistical properties of GANs. This paper provides statistical guarantees of GANs for the estimation of data distributions which have densities in a H{o}lder space. Our main result shows that, if the generator and discriminator network architectures are properly chosen (universally for all distributions with H{o}lder densities), GANs are consistent estimators of the data distributions under strong discrepancy metrics, such as the Wasserstein distance. To our best knowledge, this is the first statistical theory of GANs for H{o}lder densities. In comparison with existing works, our theory requires minimum assumptions on data distributions. Our generator and discriminator networks utilize general weight matrices and the non-invertible ReLU activation function, while many existing works only apply to invertible weight matrices and invertible activation functions. In our analysis, we decompose the error into a statistical error and an approximation error by a new oracle inequality, which may be of independent interest.