Recently, heterogeneous Graph Neural Networks (GNNs) have become the de facto model for analyzing heterogeneous graphs (HGs), but most of them rely on a relatively large amount of labeled data. In this work, we investigate Contrastive Learning (CL), a key component in self-supervised approaches, on HGs to alleviate the label scarcity problem. We first generate multiple semantic views according to metapaths and network schemas. Then, by pulling node embeddings corresponding to different semantic views close to each other (positives) and pushing other embeddings apart (negatives), one can obtain informative representations without human annotations. However, this CL approach ignores the relative hardness of negative samples, which may lead to suboptimal performance. Considering the complex graph structure and the smoothing nature of GNNs, we propose a structure-aware hard negative mining scheme for HGs that measures hardness by structural characteristics. By synthesizing more negative nodes, we give larger weights to harder negatives with limited computational overhead, which further boosts performance. Empirical studies on three real-world datasets show the effectiveness of the proposed method: it consistently outperforms existing state-of-the-art methods and, notably, even surpasses several supervised counterparts.
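To make the idea of weighting harder negatives concrete, the following is a minimal sketch of a cross-view contrastive (InfoNCE-style) loss with per-negative weights. It is an illustration under stated assumptions, not the method from the paper: the function name `weighted_info_nce`, the temperature `tau`, and the `hardness` matrix are all hypothetical, and the abstract does not specify how structural hardness is computed or how negatives are synthesized, so the scores are simply taken as an input here.

```python
import numpy as np


def weighted_info_nce(z1, z2, hardness, tau=0.5):
    """Cross-view contrastive loss with hardness-weighted negatives.

    z1, z2   : (n, d) embeddings of the same n nodes under two views.
    hardness : (n, n) positive scores; hardness[i, j] says how hard node j
               is as a negative for anchor i (ASSUMED input; the structural
               measure used in the paper is not given in the abstract).
    tau      : temperature (hypothetical default).
    """
    n = z1.shape[0]
    # Cosine similarity between every anchor in view 1 and node in view 2.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = np.exp(z1 @ z2.T / tau)

    # The positive for anchor i is the same node i in the other view.
    pos = np.diag(sim)

    # Negatives are all other nodes. Weights are rescaled so that they
    # average to 1 per anchor, which makes uniform scores reduce to the
    # ordinary InfoNCE loss. Rows of `hardness` must not sum to zero
    # off the diagonal.
    w = np.array(hardness, dtype=float)
    np.fill_diagonal(w, 0.0)
    w = w * (n - 1) / w.sum(axis=1, keepdims=True)
    neg = (w * sim).sum(axis=1)

    return float(np.mean(-np.log(pos / (pos + neg))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(6, 4))  # placeholder embeddings, view 1
    view_b = rng.normal(size=(6, 4))  # placeholder embeddings, view 2
    scores = rng.uniform(0.1, 1.0, size=(6, 6))  # placeholder hardness
    print(weighted_info_nce(view_a, view_b, scores))
```

With all hardness scores equal, the loss coincides with the unweighted contrastive objective; raising the score of a negative that is already similar to the anchor increases its share of the denominator, so the optimizer is pushed to separate that pair more strongly.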
How can you sample good negative examples for contrastive learning? We argue that, as with metric learning, contrastive learning of representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor poi
Leveraging domain knowledge including fingerprints and functional groups in molecular representation learning is crucial for chemical property prediction and drug discovery. When modeling the relation between graph structure and molecular properties
Graph Contrastive Learning (GCL) establishes a new paradigm for learning graph representations without human annotations. Although remarkable progress has been witnessed recently, the success behind GCL is still left somewhat mysterious. In this work
With the advent of big data across multiple high-impact applications, we are often facing the challenge of complex heterogeneity. The newly collected data usually consist of multiple modalities and characterized with multiple labels, thus exhibiting
Representation learning on heterogeneous graphs aims to obtain meaningful node representations to facilitate various downstream tasks, such as node classification and link prediction. Existing heterogeneous graph learning methods are primarily develo