
Multiple Accounts Detection on Facebook Using Semi-Supervised Learning on Graphs

Published by Xiaoyun Wang
Publication date: 2018
Research field: Informatics Engineering
Paper language: English





In social networks, a single user may create multiple accounts to spread his or her opinions and to influence others by actively commenting on different news pages. Demoting such abnormal activity would benefit both social networks and their communities, and the first step is to detect those accounts. However, detection is challenging because these accounts may have very realistic names and reasonable activity patterns. In this paper, we investigate three different approaches and propose using graph embedding together with semi-supervised learning to predict whether a pair of accounts was created by the same user. We carry out extensive experimental analyses to understand how changes in the input data and in algorithmic parameters and optimization affect prediction performance. We also find that local information is more important than global information for such prediction, and we identify the threshold that leads to the best results. We test the proposed approach on 6700 Facebook pages from the Middle East and achieve an average accuracy of 0.996 and an AUC (area under the curve) of 0.952 for users with the same name; on the U.S. 2016 election dataset, we obtain a best AUC of 0.877 for users with different names.
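The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea of semi-supervised learning on a graph, not the authors' method. It assumes each candidate account pair already has an embedding vector (the toy 2-D vectors here are invented), builds a Gaussian similarity graph over the pairs, and spreads a few known labels to the unlabeled pairs by iterative label propagation. The function names, the `sigma` bandwidth, and the 0.5 decision threshold are all hypothetical.

```python
import numpy as np


def gaussian_similarity(embeddings, sigma=1.0):
    """Dense similarity graph: w_ij = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    x = np.asarray(embeddings, dtype=float)
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights


def propagate_labels(weights, labels, n_iter=100):
    """Label propagation with clamping of the known labels.

    labels: 1 = same user, 0 = different users, -1 = unlabeled.
    Returns a score in [0, 1] per node (higher means "same user").
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    known = labels >= 0

    # Row-normalise so each node takes a weighted average of its neighbours.
    row_sums = weights.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    transition = weights / row_sums

    # Unlabeled nodes start at an uninformative 0.5.
    scores = np.where(known, labels, 0.5).astype(float)
    for _ in range(n_iter):
        scores = transition @ scores
        scores[known] = labels[known]  # keep the labeled nodes fixed
    return scores


# Toy data (assumed): six candidate account pairs in two tight clusters,
# with one labeled example per cluster.
toy_embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]]
toy_labels = [1, -1, -1, 0, -1, -1]

toy_scores = propagate_labels(gaussian_similarity(toy_embeddings), toy_labels)
toy_predictions = toy_scores >= 0.5  # hypothetical decision threshold
```

In this toy setup the unlabeled pairs inherit the label of the labeled pair in their own cluster. In practice the decision threshold would be tuned on held-out data, in the spirit of the threshold analysis the abstract mentions.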




Read also

The objective of active learning (AL) is to train classification models with fewer labeled instances by selecting only the most informative instances for labeling. AL algorithms designed for other data types, such as images and text, do not perform well on graph-structured data. Although a few heuristics-based AL algorithms have been proposed for graphs, a principled approach is lacking. In this paper, we propose MetAL, an AL approach that selects unlabeled instances that directly improve the future performance of a classification model. For a semi-supervised learning problem, we formulate the AL task as a bilevel optimization problem. Based on recent work in meta-learning, we use meta-gradients to approximate the impact on model performance of retraining the model with any unlabeled instance. Using multiple graph datasets from different domains, we demonstrate that MetAL efficiently outperforms existing state-of-the-art AL algorithms.
Classification tasks based on feature vectors can be significantly improved by including within deep learning a graph that summarises pairwise relationships between the samples. Intuitively, the graph acts as a conduit to channel and bias the inference of class labels. Here, we study classification methods that consider the graph as the originator of an explicit graph diffusion. We show that appending graph diffusion to feature-based learning as an a posteriori refinement achieves state-of-the-art classification accuracy. This method, which we call Graph Diffusion Reclassification (GDR), uses overshooting events of a diffusive graph dynamics to reclassify individual nodes. The method uses intrinsic measures of node influence, which are distinct for each node, and allows the evaluation of the relationship and importance of features and graph for classification. We also present diff-GCN, a simple extension of Graph Convolutional Neural Network (GCN) architectures that leverages explicit diffusion dynamics and allows the natural use of directed graphs. To showcase our methods, we use benchmark datasets of documents with associated citation data.
Yucen Luo, Jun Zhu, Mengxi Li (2017)
The recently proposed self-ensembling methods, which penalize inconsistent predictions on unlabeled data under different perturbations, have achieved promising results in deep semi-supervised learning. However, they only consider adding perturbations to each single data point, while ignoring the connections between data samples. In this paper, we propose a novel method called Smooth Neighbors on Teacher Graphs (SNTG). In SNTG, a graph is constructed based on the predictions of the teacher model, i.e., the implicit self-ensemble of models. The graph then serves as a similarity measure with respect to which the representations of similar neighboring points are learned to be smooth on the low-dimensional manifold. We achieve state-of-the-art results on semi-supervised learning benchmarks. The error rates are 9.89% for CIFAR-10 with 4000 labels and 3.99% for SVHN with 500 labels. In particular, the improvements are significant when labels are fewer. For non-augmented MNIST with only 20 labels, the error rate is reduced from the previous 4.81% to 1.36%. Our method also shows robustness to noisy labels.
We study the problem of semi-supervised learning on graphs, for which graph neural networks (GNNs) have been extensively explored. However, most existing GNNs inherently suffer from the limitations of over-smoothing, non-robustness, and weak generalization when labeled nodes are scarce. In this paper, we propose a simple yet effective framework, Graph Random Neural Networks (GRAND), to address these issues. In GRAND, we first design a random propagation strategy to perform graph data augmentation. Then we leverage consistency regularization to optimize the prediction consistency of unlabeled nodes across different data augmentations. Extensive experiments on graph benchmark datasets suggest that GRAND significantly outperforms state-of-the-art GNN baselines on semi-supervised node classification. Finally, we show that GRAND mitigates the issues of over-smoothing and non-robustness, exhibiting better generalization behavior than existing GNNs. The source code of GRAND is publicly available at https://github.com/Grand20/grand.
On social media, algorithms for content promotion that account for users' preferences might limit exposure to unsolicited content. In this work, we study how the same content (videos) is consumed on different platforms, i.e. Facebook and YouTube, over a sample of 12M users. Our findings show that the same content leads to the formation of echo chambers, irrespective of the online social network and thus of the algorithm for content promotion. Finally, we show that users' commenting patterns are accurate early predictors of the formation of echo chambers.