No Arabic abstract
Social reviews are indispensable resources for modern consumers decision making. For financial gain, companies pay fraudsters preferably in groups to demote or promote products and services since consumers are more likely to be misled by a large number of similar reviews from groups. Recent approaches on fraudster group detection employed handcrafted features of group behaviors without considering the semantic relation between reviews from the reviewers in a group. In this paper, we propose the first neural approach, HIN-RNN, a Heterogeneous Information Network (HIN) Compatible RNN for fraudster group detection that requires no handcrafted features. HIN-RNN provides a unifying architecture for representation learning of each reviewer, with the initial vector as the sum of word embeddings of all review text written by the same reviewer, concatenated by the ratio of negative reviews. Given a co-review network representing reviewers who have reviewed the same items with the same ratings and the reviewers vector representation, a collaboration matrix is acquired through HIN-RNN training. The proposed approach is confirmed to be effective with marked improvement over state-of-the-art approaches on both the Yelp (22% and 12% in terms of recall and F1-value, respectively) and Amazon (4% and 2% in terms of recall and F1-value, respectively) datasets.
We show that the classification performance of graph convolutional networks (GCNs) is related to the alignment between features, graph, and ground truth, which we quantify using a subspace alignment measure (SAM) corresponding to the Frobenius norm of the matrix of pairwise chordal distances between three subspaces associated with features, graph, and ground truth. The proposed measure is based on the principal angles between subspaces and has both spectral and geometrical interpretations. We showcase the relationship between the SAM and the classification performance through the study of limiting cases of GCNs and systematic randomizations of both features and graph structure applied to a constructive example and several examples of citation networks of different origins. The analysis also reveals the relative importance of the graph and features for classification purposes.
The recent GRAPH-BERT model introduces a new approach to learning graph representations merely based on the attention mechanism. GRAPH-BERT provides an opportunity for transferring pre-trained models and learned graph representations across different tasks within the same graph dataset. In this paper, we will further investigate the graph-to-graph transfer of a universal GRAPH-BERT for graph representation learning across different graph datasets, and our proposed model is also referred to as the G5 for simplicity. Many challenges exist in learning G5 to adapt the distinct input and output configurations for each graph data source, as well as the information distributions differences. G5 introduces a pluggable model architecture: (a) each data source will be pre-processed with a unique input representation learning component; (b) each output application task will also have a specific functional component; and (c) all such diverse input and output components will all be conjuncted with a universal GRAPH-BERT core component via an input size unification layer and an output representation fusion layer, respectively. The G5 model removes the last obstacle for cross-graph representation learning and transfer. For the graph sources with very sparse training data, the G5 model pre-trained on other graphs can still be utilized for representation learning with necessary fine-tuning. Whats more, the architecture of G5 also allows us to learn a supervised functional classifier for data sources without any training data at all. Such a problem is also named as the Apocalypse Learning task in this paper. Two different label reasoning strategies, i.e., Cross-Source Classification Consistency Maximization (CCCM) and Cross-Source Dynamic Routing (CDR), are introduced in this paper to address the problem.
Graph neural networks for heterogeneous graph embedding is to project nodes into a low-dimensional space by exploring the heterogeneity and semantics of the heterogeneous graph. However, on the one hand, most of existing heterogeneous graph embedding methods either insufficiently model the local structure under specific semantic, or neglect the heterogeneity when aggregating information from it. On the other hand, representations from multiple semantics are not comprehensively integrated to obtain versatile node embeddings. To address the problem, we propose a Heterogeneous Graph Neural Network with Multi-View Representation Learning (named MV-HetGNN) for heterogeneous graph embedding by introducing the idea of multi-view representation learning. The proposed model consists of node feature transformation, view-specific ego graph encoding and auto multi-view fusion to thoroughly learn complex structural and semantic information for generating comprehensive node representations. Extensive experiments on three real-world heterogeneous graph datasets show that the proposed MV-HetGNN model consistently outperforms all the state-of-the-art GNN baselines in various downstream tasks, e.g., node classification, node clustering, and link prediction.
The recurrent network architecture is a widely used model in sequence modeling, but its serial dependency hinders the computation parallelization, which makes the operation inefficient. The same problem was encountered in serial adder at the early stage of digital electronics. In this paper, we discuss the similarities between recurrent neural network (RNN) and serial adder. Inspired by carry-lookahead adder, we introduce carry-lookahead module to RNN, which makes it possible for RNN to run in parallel. Then, we design the method of parallel RNN computation, and finally Carry-lookahead RNN (CL-RNN) is proposed. CL-RNN takes advantages in parallelism and flexible receptive field. Through a comprehensive set of tests, we verify that CL-RNN can perform better than existing typical RNNs in sequence modeling tasks which are specially designed for RNNs.
Graph distance metric learning serves as the foundation for many graph learning problems, e.g., graph clustering, graph classification and graph matching. Existing research works on graph distance metric (or graph kernels) learning fail to maintain the basic properties of such metrics, e.g., non-negative, identity of indiscernibles, symmetry and triangle inequality, respectively. In this paper, we will introduce a new graph neural network based distance metric learning approaches, namely GB-DISTANCE (GRAPH-BERT based Neural Distance). Solely based on the attention mechanism, GB-DISTANCE can learn graph instance representations effectively based on a pre-trained GRAPH-BERT model. Different from the existing supervised/unsupervised metrics, GB-DISTANCE can be learned effectively in a semi-supervised manner. In addition, GB-DISTANCE can also maintain the distance metric basic properties mentioned above. Extensive experiments have been done on several benchmark graph datasets, and the results demonstrate that GB-DISTANCE can out-perform the existing baseline methods, especially the recent graph neural network model based graph metrics, with a significant gap in computing the graph distance.