Do Transformers Really Perform Bad for Graph Representation?

212 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Shuxin Zheng

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Chengxuan Ying - Tianle Cai - Shengjie Luo

التعلم الآلي الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer.

قيم البحث

اقرأ أيضاً

Graph Kernel Attention Transformers

90 - Krzysztof Choromanski , Han Lin , Haoxian Chen 2021

We introduce a new class of graph neural networks (GNNs), by combining several concepts that were so far studied independently - graph kernels, attention-based networks with structural priors and more recently, efficient Transformers architectures ap plying small memory footprint implicit attention methods via low rank decomposition techniques. The goal of the paper is twofold. Proposed by us Graph Kernel Attention Transformers (or GKATs) are much more expressive than SOTA GNNs as capable of modeling longer-range dependencies within a single layer. Consequently, they can use more shallow architecture design. Furthermore, GKAT attention layers scale linearly rather than quadratically in the number of nodes of the input graphs, even when those graphs are dense, requiring less compute than their regular graph attention counterparts. They achieve it by applying new classes of graph kernels admitting random feature map decomposition via random walks on graphs. As a byproduct of the introduced techniques, we obtain a new class of learnable graph sketches, called graphots, compactly encoding topological graph properties as well as nodes features. We conducted exhaustive empirical comparison of our method with nine different GNN classes on tasks ranging from motif detection through social network classification to bioinformatics challenges, showing consistent gains coming from GKATs.

التعلم الآلي الذكاء الاصطناعي

Do Language Models Perform Generalizable Commonsense Inference?

88 - Peifeng Wang , Filip Ilievski , Muhao Chen 2021

Inspired by evidence that pretrained language models (LMs) encode commonsense knowledge, recent work has applied LMs to automatically populate commonsense knowledge graphs (CKGs). However, there is a lack of understanding on their generalization to m ultiple CKGs, unseen relations, and novel entities. This paper analyzes the ability of LMs to perform generalizable commonsense inference, in terms of knowledge capacity, transferability, and induction. Our experiments with these three aspects show that: (1) LMs can adapt to different schemas defined by multiple CKGs but fail to reuse the knowledge to generalize to new relations. (2) Adapted LMs generalize well to unseen subjects, but less so on novel objects. Future work should investigate how to improve the transferability and induction of commonsense mining from LMs.

الحساب واللغة الذكاء الاصطناعي

Towards Expressive Graph Representation

90 - Chengsheng Mao , Liang Yao , Yuan Luo 2020

Graph Neural Network (GNN) aggregates the neighborhood of each node into the node embedding and shows its powerful capability for graph representation learning. However, most existing GNN variants aggregate the neighborhood information in a fixed non -injective fashion, which may map different graphs or nodes to the same embedding, reducing the model expressiveness. We present a theoretical framework to design a continuous injective set function for neighborhood aggregation in GNN. Using the framework, we propose expressive GNN that aggregates the neighborhood of each node with a continuous injective set function, so that a GNN layer maps similar nodes with similar neighborhoods to similar embeddings, different nodes to different embeddings and the equivalent nodes or isomorphic graphs to the same embeddings. Moreover, the proposed expressive GNN can naturally learn expressive representations for graphs with continuous node attributes. We validate the proposed expressive GNN (ExpGNN) for graph classification on multiple benchmark datasets including simple graphs and attributed graphs. The experimental results demonstrate that our model achieves state-of-the-art performances on most of the benchmarks.

التعلم الآلي الذكاء الاصطناعي نظرية المعلومات

CommPOOL: An Interpretable Graph Pooling Framework for Hierarchical Graph Representation Learning

94 - Haoteng Tang , Guixiang Ma , Lifang He 2020

Recent years have witnessed the emergence and flourishing of hierarchical graph pooling neural networks (HGPNNs) which are effective graph representation learning approaches for graph level tasks such as graph classification. However, current HGPNNs do not take full advantage of the graphs intrinsic structures (e.g., community structure). Moreover, the pooling operations in existing HGPNNs are difficult to be interpreted. In this paper, we propose a new interpretable graph pooling framework - CommPOOL, that can capture and preserve the hierarchical community structure of graphs in the graph representation learning process. Specifically, the proposed community pooling mechanism in CommPOOL utilizes an unsupervised approach for capturing the inherent community structure of graphs in an interpretable manner. CommPOOL is a general and flexible framework for hierarchical graph representation learning that can further facilitate various graph-level tasks. Evaluations on five public benchmark datasets and one synthetic dataset demonstrate the superior performance of CommPOOL in graph representation learning for graph classification compared to the state-of-the-art baseline methods, and its effectiveness in capturing and preserving the community structure of graphs.

التعلم الآلي الذكاء الاصطناعي

Decentralized Knowledge Graph Representation Learning

166 - Lingbing Guo , Weiqing Wang , Zequn Sun 2020

Knowledge graph (KG) representation learning methods have achieved competitive performance in many KG-oriented tasks, among which the best ones are usually based on graph neural networks (GNNs), a powerful family of networks that learns the represent ation of an entity by aggregating the features of its neighbors and itself. However, many KG representation learning scenarios only provide the structure information that describes the relationships among entities, causing that entities have no input features. In this case, existing aggregation mechanisms are incapable of inducing embeddings of unseen entities as these entities have no pre-defined features for aggregation. In this paper, we present a decentralized KG representation learning approach, decentRL, which encodes each entity from and only from the embeddings of its neighbors. For optimization, we design an algorithm to distill knowledge from the model itself such that the output embeddings can continuously gain knowledge from the corresponding original embeddings. Extensive experiments show that the proposed approach performed better than many cutting-edge models on the entity alignment task, and achieved competitive performance on the entity prediction task. Furthermore, under the inductive setting, it significantly outperformed all baselines on both tasks.

التعلم الآلي الذكاء الاصطناعي الحساب واللغة