Fundamental Limits of Deep Graph Convolutional Networks

77 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Abram Magner

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Abram Magner - Mayank Baranwal - Alfred O. Hero III

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Graph convolutional networks (GCNs) are a widely used method for graph representation learning. To elucidate the capabilities and limitations of GCNs, we investigate their power, as a function of their number of layers, to distinguish between different random graph models (corresponding to different class-conditional distributions in a classification problem) on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We give a precise characterization of the set of pairs of graphons that are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph. This characterization is in terms of a degree profile closeness property. Outside this class, a very simple GCN architecture suffices for distinguishability. We then exhibit a concrete, infinite class of graphons arising from stochastic block models that are well-separated in terms of cut distance and are indistinguishable by a GCN. These results theoretically match empirical observations of several prior works. To prove our results, we exploit a connection to random walks on graphs. Finally, we give empirical results on synthetic and real graph classification datasets, indicating that indistinguishable graph distributions arise in practice.

قيم البحث

218 - Abram Magner , Mayank Baranwal , Alfred O. Hero III 2020

Graph convolutional networks (GCNs) are a widely used method for graph representation learning. We investigate the power of GCNs, as a function of their number of layers, to distinguish between different random graph models on the basis of the embedd ings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We exhibit an infinite class of graphons that are well-separated in terms of cut distance and are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph. These results theoretically match empirical observations of several prior works. Finally, we show a converse result that for pairs of graphons satisfying a degree profile separation property, a very simple GCN architecture suffices for distinguishability. To prove our results, we exploit a connection to random walks on graphs.

التعلم الالي نظرية المعلومات التعلم الآلي

Deep Convolutional Neural Networks for Massive MIMO Fingerprint-Based Positioning

62 - Joao Vieira , Erik Leitinger , Muris Sarajlic 2017

This paper provides an initial investigation on the application of convolutional neural networks (CNNs) for fingerprint-based positioning using measured massive MIMO channels. When represented in appropriate domains, massive MIMO channels have a spar se structure which can be efficiently learned by CNNs for positioning purposes. We evaluate the positioning accuracy of state-of-the-art CNNs with channel fingerprints generated from a channel model with a rich clustered structure: the COST 2100 channel model. We find that moderately deep CNNs can achieve fractional-wavelength positioning accuracies, provided that an enough representative data set is available for training.

التعلم الالي نظرية المعلومات نظرية المعلومات

Fundamental Limits of Weak Recovery with Applications to Phase Retrieval

77 - Marco Mondelli , Andrea Montanari 2017

In phase retrieval we want to recover an unknown signal $boldsymbol xinmathbb C^d$ from $n$ quadratic measurements of the form $y_i = |langle{boldsymbol a}_i,{boldsymbol x}rangle|^2+w_i$ where $boldsymbol a_iin mathbb C^d$ are known sensing vectors a nd $w_i$ is measurement noise. We ask the following weak recovery question: what is the minimum number of measurements $n$ needed to produce an estimator $hat{boldsymbol x}(boldsymbol y)$ that is positively correlated with the signal $boldsymbol x$? We consider the case of Gaussian vectors $boldsymbol a_i$. We prove that - in the high-dimensional limit - a sharp phase transition takes place, and we locate the threshold in the regime of vanishingly small noise. For $nle d-o(d)$ no estimator can do significantly better than random and achieve a strictly positive correlation. For $nge d+o(d)$ a simple spectral estimator achieves a positive correlation. Surprisingly, numerical simulations with the same spectral estimator demonstrate promising performance with realistic sensing matrices. Spectral methods are used to initialize non-convex optimization algorithms in phase retrieval, and our approach can boost the performance in this setting as well. Our impossibility result is based on classical information-theory arguments. The spectral algorithm computes the leading eigenvector of a weighted empirical covariance matrix. We obtain a sharp characterization of the spectral properties of this random matrix using tools from free probability and generalizing a recent result by Lu and Li. Both the upper and lower bound generalize beyond phase retrieval to measurements $y_i$ produced according to a generalized linear model. As a byproduct of our analysis, we compare the threshold of the proposed spectral method with that of a message passing algorithm.

التعلم الالي نظرية المعلومات نظرية المعلومات

Revisiting Graph Neural Networks: All We Have is Low-Pass Filters

306 - Hoang NT , Takanori Maehara 2019

Graph neural networks have become one of the most important techniques to solve machine learning problems on graph-structured data. Recent work on vertex classification proposed deep and distributed learning models to achieve high performance and sca lability. However, we find that the feature vectors of benchmark datasets are already quite informative for the classification task, and the graph structure only provides a means to denoise the data. In this paper, we develop a theoretical framework based on graph signal processing for analyzing graph neural networks. Our results indicate that graph neural networks only perform low-pass filtering on feature vectors and do not have the non-linear manifold learning property. We further investigate their resilience to feature noise and propose some insights on GCN-based graph neural network design.

التعلم الالي نظرية المعلومات التعلم الآلي

Convolutional Kernel Networks for Graph-Structured Data

202 - Dexiong Chen , Laurent Jacob , Julien Mairal 2020

We introduce a family of multilayer graph kernels and establish new links between graph convolutional neural networks and kernel methods. Our approach generalizes convolutional kernel networks to graph-structured data, by representing graphs as a seq uence of kernel feature maps, where each node carries information about local graph substructures. On the one hand, the kernel point of view offers an unsupervised, expressive, and easy-to-regularize data representation, which is useful when limited samples are available. On the other hand, our model can also be trained end-to-end on large-scale data, leading to new types of graph convolutional neural networks. We show that our method achieves competitive performance on several graph classification benchmarks, while offering simple model interpretation. Our code is freely available at https://github.com/claying/GCKN.

التعلم الالي التعلم الآلي