Network representation learning (NRL) plays a vital role in a variety of tasks such as node classification and link prediction. It aims to learn low-dimensional vector representations for nodes based on network structures or node attributes. While embedding techniques on complete networks have been intensively studied, in real-world applications, it is still a challenging task to collect complete networks. To bridge the gap, in this paper, we propose a Deep Incomplete Network Embedding method, namely DINE. Specifically, we first complete the missing part including both nodes and edges in a partially observable network by using the expectation-maximization framework. To improve the embedding performance, we consider both network structures and node attributes to learn node representations. Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks. The results demonstrate the superiority of our proposed approach compared against state-of-the-art baselines.
Attributed networks are ubiquitous, since a network often comes with auxiliary attribute information, e.g., a social network with user profiles. Attributed Network Embedding (ANE), which aims to learn unified low-dimensional node embeddings while preserving both structural and attribute information, has recently attracted considerable attention. The resulting node embeddings can then facilitate various downstream network tasks, e.g., link prediction. Although there are several ANE methods, most of them cannot deal with incomplete attributed networks with missing links and/or missing node attributes, which often occur in real-world scenarios. To address this issue, we propose a robust ANE method. The general idea is to reconstruct a unified denser network by fusing the two sources of information for information enhancement, and then to employ a random-walk-based network embedding method to learn node embeddings. Experiments on link prediction, node classification, visualization, and parameter sensitivity analysis on six real-world datasets validate the effectiveness of our method on incomplete attributed networks.
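The fuse-then-walk idea can be illustrated with a minimal sketch. It is not the method of the abstract above: the mixing weight `alpha`, the `top_k` cutoff, the use of cosine similarity, and all function names are assumptions made only for illustration.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fuse_network(adj, attrs, alpha=0.5, top_k=2):
    """Mix observed links with attribute similarity into directed edge weights.

    adj:   dict node -> set of observed neighbors (possibly incomplete)
    attrs: dict node -> attribute vector (list of floats)
    alpha, top_k: hypothetical knobs, not taken from any paper
    """
    nodes = sorted(adj)
    fused = {}
    for u in nodes:
        # Every observed link contributes a structural weight of alpha.
        weights = {v: alpha for v in adj[u]}
        # The top_k most attribute-similar nodes add (1 - alpha) * similarity.
        sims = sorted(((cosine(attrs[u], attrs[v]), v) for v in nodes if v != u),
                      reverse=True)[:top_k]
        for s, v in sims:
            if s > 0:
                weights[v] = weights.get(v, 0.0) + (1 - alpha) * s
        fused[u] = weights
    return fused

def random_walk(fused, start, length, rng):
    """Weighted random walk of at most `length` nodes on the fused network."""
    walk = [start]
    while len(walk) < length:
        nbrs = fused[walk[-1]]
        if not nbrs:
            break
        candidates = list(nbrs)
        walk.append(rng.choices(candidates, weights=[nbrs[n] for n in candidates])[0])
    return walk

# Toy network: nodes 2 and 3 have no observed links but similar attributes.
adj = {0: {1}, 1: {0}, 2: set(), 3: set()}
attrs = {0: [1.0, 0.0], 1: [1.0, 0.1], 2: [0.0, 1.0], 3: [0.1, 1.0]}
fused = fuse_network(adj, attrs, alpha=0.5, top_k=1)
walk = random_walk(fused, start=2, length=5, rng=random.Random(0))
print(fused[2], walk)
```

In this toy case the isolated node 2 becomes reachable through its attribute-similar node 3, which is the kind of densification the abstract describes. The walks produced this way could then be fed to any skip-gram style embedding step.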
The behaviour of information cascades (such as retweets) has been modelled extensively. While point process-based generative models have long been in use for estimating cascade growth, deep learning has greatly enhanced diverse feature integration. We observe two significant temporal signals in cascade data that have not been emphasized or reported to our knowledge. First, the popularity of the cascade root is known to influence cascade size strongly, but the effect falls off rapidly with time. Second, there is a measurable positive correlation between the novelty of the root content (with respect to a streaming external corpus) and the relative size of the resulting cascade. Responding to these observations, we propose GammaCas, a new cascade growth model as a parametric function of time, which combines deep influence signals from content (e.g., tweet text), network features (e.g., followers of the root user), and exogenous event sources (e.g., online news). Specifically, our model processes these signals through a customized recurrent network, whose states then provide the parameters of the cascade rate function, which is integrated over time to predict the cascade size. The network parameters are trained end-to-end using observed cascades. GammaCas significantly outperforms seven recent and diverse baselines on a large-scale dataset of retweet cascades coupled with time-aligned online news; it beats the best baseline with an 18.98% increase in terms of Kendall's $\tau$ correlation and a $35.63$ reduction in Mean Absolute Percentage Error. Extensive ablation and case studies unearth interesting insights regarding retweet cascade dynamics.
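The step "a rate function integrated over time gives the cascade size" can be sketched numerically. The gamma-shaped rate below, its parameter names (`scale`, `shape`, `decay`), and the fixed parameter values are assumptions for illustration only. The abstract does not state the functional form, and in the described model the parameters would come from the recurrent network rather than being set by hand.

```python
import math

def rate(t, scale, shape, decay):
    """Hypothetical gamma-shaped rate: scale * t**(shape - 1) * exp(-decay * t).

    Assumes shape >= 1 so that the rate is finite at t = 0.
    """
    return scale * t ** (shape - 1) * math.exp(-decay * t)

def predicted_size(horizon, scale, shape, decay, steps=10_000):
    """Trapezoidal integral of the rate over [0, horizon]."""
    h = horizon / steps
    total = 0.5 * (rate(0.0, scale, shape, decay) + rate(horizon, scale, shape, decay))
    for i in range(1, steps):
        total += rate(i * h, scale, shape, decay)
    return total * h

# With shape = 1 the rate is a plain exponential decay, so the integral has
# the closed form (scale / decay) * (1 - exp(-decay * horizon)).
size = predicted_size(10.0, scale=5.0, shape=1.0, decay=0.5)
closed_form = (5.0 / 0.5) * (1.0 - math.exp(-0.5 * 10.0))
print(size, closed_form)
```

Because the rate is non-negative, the predicted size can only grow as the horizon extends, which matches the intuition that a cascade never shrinks.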
Recently, information cascade prediction has attracted increasing interest from researchers, but it is far from being well solved, partly due to three defects of existing works. First, existing works often assume an underlying information diffusion model, which is impractical in the real world due to the complexity of information diffusion. Second, existing works often ignore the prediction of the infection order, which also plays an important role in social network analysis. Finally, existing works often require knowledge of the underlying diffusion networks, which are likely unobservable in practice. In this paper, we aim to predict both node infection and infection order without requiring knowledge of the underlying diffusion mechanism or the diffusion network. The challenges are two-fold. The first is which cascading characteristics of nodes should be captured and how to capture them, and the second is how to model the non-linear features of nodes in information cascades. To address these challenges, we propose a novel model called Deep Collaborative Embedding (DCE) for information cascade prediction, which can capture not only the node structural property but also two kinds of node cascading characteristics. We propose an auto-encoder based collaborative embedding framework to learn the node embeddings with cascade collaboration and node collaboration, so that the non-linearity of information cascades can be effectively captured. The results of extensive experiments conducted on real-world datasets verify the effectiveness of our approach.
To better represent multivariate relationships, this paper investigates a motif-dimensional framework, OFFER, for higher-order graph learning. The proposed framework mainly aims at accelerating and improving higher-order graph learning results. We apply the acceleration procedure from the dimension of network motifs. Specifically, the degrees of nodes and edges are refined in two stages: (1) employ the motif degree of nodes to refine the adjacency matrix of the network; and (2) employ the motif degree of edges to refine the transition probability matrix in the learning process. In order to assess the efficiency of the proposed framework, four popular network representation algorithms are modified and examined. Both link prediction and clustering results demonstrate that the graph representation learning algorithms enhanced with OFFER consistently outperform the original algorithms with higher efficiency.
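To make the notion of motif degree concrete, the sketch below counts triangle motifs per node and per edge, then biases random-walk transition probabilities toward edges that take part in more triangles. The choice of the triangle as the motif and the `1 + motif degree` weighting are assumptions for illustration, not the refinement rule of the framework described above.

```python
def triangle_motif_degrees(adj):
    """Count triangles per node and per edge in an undirected graph.

    adj: dict node -> set of neighbors (symmetric).
    Returns (node_deg, edge_deg), with edge keys stored as (u, v), u < v.
    """
    edge_deg = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Each common neighbor closes one triangle over edge (u, v).
                edge_deg[(u, v)] = len(adj[u] & adj[v])
    node_total = {u: 0 for u in adj}
    for (u, v), count in edge_deg.items():
        node_total[u] += count
        node_total[v] += count
    # A triangle at node u is seen once from each of its two edges at u.
    node_deg = {u: total // 2 for u, total in node_total.items()}
    return node_deg, edge_deg

def motif_transition_probs(adj, edge_deg):
    """Row-normalized transition probabilities weighted by 1 + edge motif degree.

    The added 1 is an assumption that keeps triangle-free edges walkable.
    """
    probs = {}
    for u in adj:
        weights = {v: 1 + edge_deg[(min(u, v), max(u, v))] for v in adj[u]}
        total = sum(weights.values())
        probs[u] = {v: w / total for v, w in weights.items()}
    return probs

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
node_deg, edge_deg = triangle_motif_degrees(adj)
probs = motif_transition_probs(adj, edge_deg)
print(node_deg, edge_deg, probs[2])
```

In this toy graph a walker at node 2 is steered toward nodes 0 and 1, which share a triangle with it, and away from the pendant node 3.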
Signed network embedding learns low-dimensional representations of nodes in signed networks with both positive and negative links, which facilitates downstream tasks such as link prediction with general data mining frameworks. Due to the distinct properties and significant added value of negative links, existing signed network embedding methods usually rely on dedicated designs based on social theories such as balance theory and status theory. However, these methods ignore the multiple facets of each node and mix them up in a single representation, which limits their ability to capture fine-grained attention between node pairs. In this paper, we propose MUSE, a MUlti-faceted attention-based Signed network Embedding framework to tackle this problem. Specifically, a joint intra- and inter-facet attention mechanism is introduced to aggregate fine-grained information from neighbor nodes. Moreover, balance theory is also utilized to guide information aggregation from multi-order balanced and unbalanced neighbors. Experimental results on four real-world signed network datasets demonstrate the effectiveness of our proposed framework.
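Balance theory treats a path as balanced when it contains an even number of negative links ("the enemy of my enemy is my friend"). The sketch below splits multi-order neighbors into balanced and unbalanced sets on that basis. The function name and data layout are hypothetical, and the attention-based aggregation of the framework above is not modelled here.

```python
def multi_order_neighbors(pos, neg, node, hops):
    """Return [(balanced, unbalanced), ...], one pair of node sets per hop.

    pos, neg: dict node -> set of neighbors via positive / negative links.
    A node reached by a path with an even number of negative links is
    balanced; an odd number makes it unbalanced. In a network that is not
    structurally balanced, a node may appear in both sets.
    """
    balanced, unbalanced = {node}, set()
    layers = []
    for _ in range(hops):
        next_balanced, next_unbalanced = set(), set()
        for k in balanced:
            next_balanced |= pos.get(k, set())    # friend of a friend
            next_unbalanced |= neg.get(k, set())  # enemy of a friend
        for k in unbalanced:
            next_balanced |= neg.get(k, set())    # enemy of an enemy
            next_unbalanced |= pos.get(k, set())  # friend of an enemy
        balanced, unbalanced = next_balanced, next_unbalanced
        layers.append((balanced, unbalanced))
    return layers

# Toy signed network: a +b, b -c, c -d (links are undirected).
pos = {"a": {"b"}, "b": {"a"}}
neg = {"b": {"c"}, "c": {"b", "d"}, "d": {"c"}}
layers = multi_order_neighbors(pos, neg, "a", hops=3)
for hop, (bal, unbal) in enumerate(layers, start=1):
    print(hop, sorted(bal), sorted(unbal))
```

In the toy network, node `d` ends up as a third-order balanced neighbor of `a`, because the path a, b, c, d crosses two negative links.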