No Arabic abstract
Social data mining is an interesting phe-nomenon which colligates different sources of social data to extract information. This information can be used in relationship prediction, decision making, pat-tern recognition, social mapping, responsibility distri-bution and many other applications. This paper presents a systematical data mining architecture to mine intellectual knowledge from social data. In this research, we use social networking site facebook as primary data source. We collect different attributes such as about me, comments, wall post and age from facebook as raw data and use advanced data mining approaches to excavate intellectual knowledge. We also analyze our mined knowledge with comparison for possible usages like as human behavior prediction, pattern recognition, job responsibility distribution, decision making and product promoting.
Semi-supervised learning on graphs is an important problem in the machine learning area. In recent years, state-of-the-art classification methods based on graph neural networks (GNNs) have shown their superiority over traditional ones such as label propagation. However, the sophisticated architectures of these neural models will lead to a complex prediction mechanism, which could not make full use of valuable prior knowledge lying in the data, e.g., structurally correlated nodes tend to have the same class. In this paper, we propose a framework based on knowledge distillation to address the above issues. Our framework extracts the knowledge of an arbitrary learned GNN model (teacher model), and injects it into a well-designed student model. The student model is built with two simple prediction mechanisms, i.e., label propagation and feature transformation, which naturally preserves structure-based and feature-based prior knowledge, respectively. In specific, we design the student model as a trainable combination of parameterized label propagation and feature transformation modules. As a result, the learned student can benefit from both prior knowledge and the knowledge in GNN teachers for more effective predictions. Moreover, the learned student model has a more interpretable prediction process than GNNs. We conduct experiments on five public benchmark datasets and employ seven GNN models including GCN, GAT, APPNP, SAGE, SGC, GCNII and GLP as the teacher models. Experimental results show that the learned student model can consistently outperform its corresponding teacher model by 1.4% - 4.7% on average. Code and data are available at https://github.com/BUPT-GAMMA/CPF
The task of Knowledge Graph Completion (KGC) aims to automatically infer the missing fact information in Knowledge Graph (KG). In this paper, we take a new perspective that aims to leverage rich user-item interaction data (user interaction data for short) for improving the KGC task. Our work is inspired by the observation that many KG entities correspond to online items in application systems. However, the two kinds of data sources have very different intrinsic characteristics, and it is likely to hurt the original performance using simple fusion strategy. To address this challenge, we propose a novel adversarial learning approach by leveraging user interaction data for the KGC task. Our generator is isolated from user interaction data, and serves to improve the performance of the discriminator. The discriminator takes the learned useful information from user interaction data as input, and gradually enhances the evaluation capacity in order to identify the fake samples generated by the generator. To discover implicit entity preference of users, we design an elaborate collaborative learning algorithms based on graph neural networks, which will be jointly optimized with the discriminator. Such an approach is effective to alleviate the issues about data heterogeneity and semantic complexity for the KGC task. Extensive experiments on three real-world datasets have demonstrated the effectiveness of our approach on the KGC task.
Sources of commonsense knowledge support applications in natural language understanding, computer vision, and knowledge graphs. Given their complementarity, their integration is desired. Yet, their different foci, modeling approaches, and sparse overlap make integration difficult. In this paper, we consolidate commonsense knowledge by following five principles, which we apply to combine seven key sources into a first integrated CommonSense Knowledge Graph (CSKG). We analyze CSKG and its various text and graph embeddings, showing that CSKG is well-connected and that its embeddings provide a useful entry point to the graph. We demonstrate how CSKG can provide evidence for generalizable downstream reasoning and for pre-training of language models. CSKG and all its embeddings are made publicly available to support further research on commonsense knowledge integration and reasoning.
In the present day, more than 3.8 billion people around the world actively use social media. The effectiveness of social media in facilitating quick and easy sharing of information has attracted brands and advertizers who wish to use the platform to market products via the influencers in the network. Influencers, owing to their massive popularity, provide a huge potential customer base generating higher returns of investment in a very short period. However, it is not straightforward to decide which influencers should be selected for an advertizing campaign that can generate maximum returns with minimum investment. In this work, we present an agent-based model (ABM) that can simulate the dynamics of influencer advertizing campaigns in a variety of scenarios and can help to discover the best influencer marketing strategy. Our system is a probabilistic graph-based model that incorporates real-world factors such as customers interest in a product, customer behavior, the willingness to pay, a brands investment cap, influencers engagement with influence diffusion, and the nature of the product being advertized viz. luxury and non-luxury.
We study the extent to which we can infer users geographical locations from social media. Location inference from social media can benefit many applications, such as disaster management, targeted advertising, and news content tailoring. The challenges, however, lie in the limited amount of labeled data and the large scale of social networks. In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM). The model provides a probabilistic framework in which various sources of information (e.g., content and social network) can be combined together. We design a two-layer neural network to learn feature representations, and incorporate the learned latent features into SSFGM. To deal with the large-scale problem, we propose a Two-Chain Sampling (TCS) algorithm to learn SSFGM. The algorithm achieves a good trade-off between accuracy and efficiency. Experiments on Twitter and Weibo show that the proposed TCS algorithm for SSFGM can substantially improve the inference accuracy over several state-of-the-art methods. More importantly, TCS achieves over 100x speedup comparing with traditional propagation-based methods (e.g., loopy belief propagation).