No Arabic abstract
Community detection, aiming to group the graph nodes into clusters with dense inner-connection, is a fundamental graph mining task. Recently, it has been studied on the heterogeneous graph, which contains multiple types of nodes and edges, posing great challenges for modeling the high-order relationship between nodes. With the surge of graph embedding mechanism, it has also been adopted to community detection. A remarkable group of works use the meta-path to capture the high-order relationship between nodes and embed them into nodes embedding to facilitate community detection. However, defining meaningful meta-paths requires much domain knowledge, which largely limits their applications, especially on schema-rich heterogeneous graphs like knowledge graphs. To alleviate this issue, in this paper, we propose to exploit the context path to capture the high-order relationship between nodes, and build a Context Path-based Graph Neural Network (CP-GNN) model. It recursively embeds the high-order relationship between nodes into the node embedding with attention mechanisms to discriminate the importance of different relationships. By maximizing the expectation of the co-occurrence of nodes connected by context paths, the model can learn the nodes embeddings that both well preserve the high-order relationship between nodes and are helpful for community detection. Extensive experimental results on four real-world datasets show that CP-GNN outperforms the state-of-the-art community detection methods.
Fake news, false or misleading information presented as news, has a great impact on many aspects of society, such as politics and healthcare. To handle this emerging problem, many fake news detection methods have been proposed, applying Natural Language Processing (NLP) techniques on the article text. Considering that even people cannot easily distinguish fake news by news content, these text-based solutions are insufficient. To further improve fake news detection, researchers suggested graph-based solutions, utilizing the social context information such as user engagement or publishers information. However, existing graph-based methods still suffer from the following four major drawbacks: 1) expensive computational cost due to a large number of user nodes in the graph, 2) the error in sub-tasks, such as textual encoding or stance detection, 3) loss of rich social context due to homogeneous representation of news graphs, and 4) the absence of temporal information utilization. In order to overcome the aforementioned issues, we propose a novel social context aware fake news detection method, Hetero-SCAN, based on a heterogeneous graph neural network. Hetero-SCAN learns the news representation from the heterogeneous graph of news in an end-to-end manner. We demonstrate that Hetero-SCAN yields significant improvement over state-of-the-art text-based and graph-based fake news detection methods in terms of performance and efficiency.
The purpose of the Session-Based Recommendation System is to predict the users next click according to the previous session sequence. The current studies generally learn user preferences according to the transitions of items in the users session sequence. However, other effective information in the session sequence, such as user profiles, are largely ignored which may lead to the model unable to learn the users specific preferences. In this paper, we propose a heterogeneous graph neural network-based session recommendation method, named SR-HetGNN, which can learn session embeddings by heterogeneous graph neural network (HetGNN), and capture the specific preferences of anonymous users. Specifically, SR-HetGNN first constructs heterogeneous graphs containing various types of nodes according to the session sequence, which can capture the dependencies among items, users, and sessions. Second, HetGNN captures the complex transitions between items and learns the item embeddings containing user information. Finally, to consider the influence of users long and short-term preferences, local and global session embeddings are combined with the attentional network to obtain the final session embedding. SR-HetGNN is shown to be superior to the existing state-of-the-art session-based recommendation methods through extensive experiments over two real large datasets Diginetica and Tmall.
Predicting the start-ups that will eventually succeed is essentially important for the venture capital business and worldwide policy makers, especially at an early stage such that rewards can possibly be exponential. Though various empirical studies and data-driven modeling work have been done, the predictive power of the complex networks of stakeholders including venture capital investors, start-ups, and start-ups managing members has not been thoroughly explored. We design an incremental representation learning mechanism and a sequential learning model, utilizing the network structure together with the rich attributes of the nodes. In general, our method achieves the state-of-the-art prediction performance on a comprehensive dataset of global venture capital investments and surpasses human investors by large margins. Specifically, it excels at predicting the outcomes for start-ups in industries such as healthcare and IT. Meanwhile, we shed light on impacts on start-up success from observable factors including gender, education, and networking, which can be of value for practitioners as well as policy makers when they screen ventures of high growth potentials.
The real-world networks often compose of different types of nodes and edges with rich semantics, widely known as heterogeneous information network (HIN). Heterogeneous network embedding aims to embed nodes into low-dimensional vectors which capture rich intrinsic information of heterogeneous networks. However, existing models either depend on manually designing meta-paths, ignore mutual effects between different semantics, or omit some aspects of information from global networks. To address these limitations, we propose a novel Graph-Aggregated Heterogeneous Network Embedding (GAHNE), which is designed to extract the semantics of HINs as comprehensively as possible to improve the results of downstream tasks based on graph convolutional neural networks. In GAHNE model, we develop several mechanisms that can aggregate semantic representations from different single-type sub-networks as well as fuse the global information into final embeddings. Extensive experiments on three real-world HIN datasets show that our proposed model consistently outperforms the existing state-of-the-art methods.
A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding is to embed rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components, i.e., the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.