From Node Embedding To Community Embedding : A Hyperbolic Approach

326 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hatem Hajri

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Thomas Gerald - Hadi Zaatiti - Hatem Hajri

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Detecting communities on graphs has received significant interest in recent literature. Current state-of-the-art community embedding approach called textit{ComE} tackles this problem by coupling graph embedding with community detection. Considering the success of hyperbolic representations of graph-structured data in last years, an ongoing challenge is to set up a hyperbolic approach for the community detection problem. The present paper meets this challenge by introducing a Riemannian equivalent of textit{ComE}. Our proposed approach combines hyperbolic embeddings with Riemannian K-means or Riemannian mixture models to perform community detection. We illustrate the usefulness of this framework through several experiments on real-world social networks and comparisons with textit{ComE} and recent hyperbolic-based classification approaches.

قيم البحث

119 - Lingfei Wu , Ian En-Hsu Yen , Zhen Zhang 2019

Graph kernels are widely used for measuring the similarity between graphs. Many existing graph kernels, which focus on local patterns within graphs rather than their global properties, suffer from significant structure information loss when represent ing graphs. Some recent global graph kernels, which utilizes the alignment of geometric node embeddings of graphs, yield state-of-the-art performance. However, these graph kernels are not necessarily positive-definite. More importantly, computing the graph kernel matrix will have at least quadratic {time} complexity in terms of the number and the size of the graphs. In this paper, we propose a new family of global alignment graph kernels, which take into account the global properties of graphs by using geometric node embeddings and an associated node transportation based on earth movers distance. Compared to existing global kernels, the proposed kernel is positive-definite. Our graph kernel is obtained by defining a distribution over emph{random graphs}, which can naturally yield random feature approximations. The random feature approximations lead to our graph embeddings, which is named as random graph embeddings (RGE). In particular, RGE is shown to achieve emph{(quasi-)linear scalability} with respect to the number and the size of the graphs. The experimental results on nine benchmark datasets demonstrate that RGE outperforms or matches twelve state-of-the-art graph classification algorithms.

التعلم الآلي التعلم الالي

Node Embedding via Word Embedding for Network Community Discovery

399 - Weicong Ding , Christy Lin , Prakash Ishwar 2016

Neural node embeddings have recently emerged as a powerful representation for supervised learning tasks involving graph-structured data. We leverage this recent advance to develop a novel algorithm for unsupervised community discovery in graphs. Thro ugh extensive experimental studies on simulated and real-world data, we demonstrate that the proposed approach consistently improves over the current state-of-the-art. Specifically, our approach empirically attains the information-theoretic limits for community recovery under the benchmark Stochastic Block Models for graph generation and exhibits better stability and accuracy over both Spectral Clustering and Acyclic Belief Propagation in the community recovery limits.

الشبكات الاجتماعية والمعلومات الفيزياء والمجتمع التعلم الالي

Multilayer Graph Clustering with Optimized Node Embedding

95 - Mireille El Gheche , Pascal Frossard 2021

We are interested in multilayer graph clustering, which aims at dividing the graph nodes into categories or communities. To do so, we propose to learn a clustering-friendly embedding of the graph nodes by solving an optimization problem that involves a fidelity term to the layers of a given multilayer graph, and a regularization on the (single-layer) graph induced by the embedding. The fidelity term uses the contrastive loss to properly aggregate the observed layers into a representative embedding. The regularization pushes for a sparse and community-aware graph, and it is based on a measure of graph sparsification called effective resistance, coupled with a penalization of the first few eigenvalues of the representative graph Laplacian matrix to favor the formation of communities. The proposed optimization problem is nonconvex but fully differentiable, and thus can be solved via the descent gradient method. Experiments show that our method leads to a significant improvement w.r.t. state-of-the-art multilayer graph clustering algorithms.

التعلم الآلي الذكاء الاصطناعي

Data Poisoning Attack against Unsupervised Node Embedding Methods

304 - Mingjie Sun , Jian Tang , Huichen Li 2018

Unsupervised node embedding methods (e.g., DeepWalk, LINE, and node2vec) have attracted growing interests given their simplicity and effectiveness. However, although these methods have been proved effective in a variety of applications, none of the e xisting work has analyzed the robustness of them. This could be very risky if these methods are attacked by an adversarial party. In this paper, we take the task of link prediction as an example, which is one of the most fundamental problems for graph analysis, and introduce a data positioning attack to node embedding methods. We give a complete characterization of attackers utilities and present efficient solutions to adversarial attacks for two popular node embedding methods: DeepWalk and LINE. We evaluate our proposed attack model on multiple real-world graphs. Experimental results show that our proposed model can significantly affect the results of link prediction by slightly changing the graph structures (e.g., adding or removing a few edges). We also show that our proposed model is very general and can be transferable across different embedding methods. Finally, we conduct a case study on a coauthor network to better understand our attack method.

التعلم الآلي التشفير والأمن الشبكات الاجتماعية والمعلومات

Learning Based Proximity Matrix Factorization for Node Embedding

180 - Xingyi Zhang , Kun Xie , Sibo Wang 2021

Node embedding learns a low-dimensional representation for each node in the graph. Recent progress on node embedding shows that proximity matrix factorization methods gain superb performance and scale to large graphs with millions of nodes. Existing approaches first define a proximity matrix and then learn the embeddings that fit the proximity by matrix factorization. Most existing matrix factorization methods adopt the same proximity for different tasks, while it is observed that different tasks and datasets may require different proximity, limiting their representation power. Motivated by this, we propose {em Lemane}, a framework with trainable proximity measures, which can be learned to best suit the datasets and tasks at hand automatically. Our method is end-to-end, which incorporates differentiable SVD in the pipeline so that the parameters can be trained via backpropagation. However, this learning process is still expensive on large graphs. To improve the scalability, we train proximity measures only on carefully subsampled graphs, and then apply standard proximity matrix factorization on the original graph using the learned proximity. Note that, computing the learned proximities for each pair is still expensive for large graphs, and existing techniques for computing proximities are not applicable to the learned proximities. Thus, we present generalized push techniques to make our solution scalable to large graphs with millions of nodes. Extensive experiments show that our proposed solution outperforms existing solutions on both link prediction and node classification tasks on almost all datasets.

التعلم الآلي