Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data


Published by: Amila Silva

Publication date: 2021

Research field: Informatics Engineering

Research language: English

Authors: Amila Silva, Ling Luo, Shanika Karunasekera

Computation and Language, Information Retrieval, Machine Learning


With the rapid evolution of social media, fake news has become a significant social problem, which cannot be addressed in a timely manner using manual investigation. This has motivated numerous studies on automating fake news detection. Most studies explore supervised training models with different modalities (e.g., text, images, and propagation networks) of news records to identify fake news. However, the performance of such techniques generally drops if news records are coming from different domains (e.g., politics, entertainment), especially for domains that are unseen or rarely-seen during training. As motivation, we empirically show that news records from different domains have significantly different word usage and propagation patterns. Furthermore, due to the sheer volume of unlabelled news records, it is challenging to select news records for manual labelling so that the domain-coverage of the labelled dataset is maximized. Hence, this work: (1) proposes a novel framework that jointly preserves domain-specific and cross-domain knowledge in news records to detect fake news from different domains; and (2) introduces an unsupervised technique to select a set of unlabelled informative news records for manual labelling, which can be ultimately used to train a fake news detection model that performs well for many domains while minimizing the labelling cost. Our experiments show that the integration of the proposed fake news model and the selective annotation approach achieves state-of-the-art performance for cross-domain news datasets, while yielding notable improvements for rarely-appearing domains in news datasets.
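The abstract does not say how the unsupervised selection of records for labelling works. As a hedged illustration of the general goal only (spending a fixed labelling budget on records that cover the data well), the sketch below uses greedy farthest-point sampling over record embeddings. This is a generic stand-in technique, not the authors' method; the function names, the embedding inputs, and the choice of Euclidean distance are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_for_labelling(embeddings, budget):
    """Pick up to `budget` record indices by greedy farthest-point sampling.

    Hypothetical stand-in for a coverage-oriented selection step: each
    new pick is the record farthest from everything picked so far.
    """
    if budget <= 0 or not embeddings:
        return []
    selected = [0]  # assumption: start from the first record
    # Distance from every record to its nearest selected record.
    nearest = [euclidean(e, embeddings[0]) for e in embeddings]
    while len(selected) < min(budget, len(embeddings)):
        idx = max(range(len(embeddings)), key=lambda i: nearest[i])
        if nearest[idx] == 0.0:
            break  # every remaining record duplicates a selected one
        selected.append(idx)
        for i, e in enumerate(embeddings):
            nearest[i] = min(nearest[i], euclidean(e, embeddings[idx]))
    return selected
```

With three well-separated clusters of embeddings and a budget of three, this picks one record from each cluster, which is the kind of coverage a domain-aware labelling budget is after.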


Xinyi Zhou, Jindi Wu, Reza Zafarani 2020

Effective detection of fake news has recently attracted significant attention. Current studies have made significant contributions to predicting fake news with less focus on exploiting the relationship (similarity) between the textual and visual information in news articles. Attaching importance to such similarity helps identify fake news stories that, for example, attempt to use irrelevant images to attract readers' attention. In this work, we propose a Similarity-Aware FakE news detection method (SAFE) which investigates multi-modal (textual and visual) information of news articles. First, neural networks are adopted to separately extract textual and visual features for news representation. We further investigate the relationship between the extracted features across modalities. Such representations of news textual and visual information along with their relationship are jointly learned and used to predict fake news. The proposed method facilitates recognizing the falsity of news articles based on their text, images, or their mismatches. We conduct extensive experiments on large-scale real-world data, which demonstrate the effectiveness of the proposed method.
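As a rough illustration of the text-image relationship idea described above, the sketch below scores how poorly a textual feature vector and a visual feature vector agree, using cosine similarity. It is not the SAFE model: the abstract says the features and their relationship are learned jointly by neural networks, whereas here the vectors are assumed to be given, and the function names and the rescaling to [0, 1] are hypothetical choices.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mismatch_score(text_vec, image_vec):
    """Map cosine similarity in [-1, 1] to a mismatch score in [0, 1].

    Assumption: both vectors already live in a shared feature space.
    0.0 means the modalities agree fully, 1.0 means they point in
    opposite directions.
    """
    return (1.0 - cosine_similarity(text_vec, image_vec)) / 2.0
```

A score like this could be used as one extra input feature next to the text and image representations, so that a classifier can react to an article whose image has little to do with its text.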

Computation and Language, Computer Vision and Pattern Recognition, Machine Learning

Knowledge Enhanced Multi-modal Fake News Detection

Yi Han, Amila Silva, Ling Luo 2021

Recent years have witnessed the significant damage caused by various types of fake news. Although considerable effort has been applied to address this issue and much progress has been made on detecting fake news, most existing approaches mainly rely on the textual content and/or social context, while knowledge-level information (entities extracted from the news content and the relations between them) is much less explored. Within the limited work on knowledge-based fake news detection, an external knowledge graph is often required, which may introduce additional problems: it is quite common for entities and relations, especially with respect to new concepts, to be missing in existing knowledge graphs, and both entity prediction and link prediction are open research questions themselves. Therefore, in this work, we investigate knowledge-based fake news detection that does not require any external knowledge graph. Specifically, our contributions include: (1) transforming the problem of detecting fake news into a subgraph classification task: entities and relations are extracted from each news item to form a single knowledge graph, where a news item is represented by a subgraph, and a graph neural network (GNN) model is then trained to classify each subgraph/news item; and (2) further improving the performance of this model through a simple but effective multi-modal technique that combines extracted knowledge, textual content and social context. Experiments on multiple datasets with thousands of labelled news items demonstrate that our knowledge-based algorithm outperforms existing counterpart methods, and its performance can be further boosted by the multi-modal approach.

Social and Information Networks

Credibility-based Fake News Detection

Niraj Sitaula, Chilukuri K. Mohan, Jennifer Grygiel 2019

Fake news can significantly misinform people who often rely on online sources and social media for their information. Current research on fake news detection has mostly focused on analyzing fake news content and how it propagates on a network of users. In this paper, we emphasize the detection of fake news by assessing its credibility. By analyzing public fake news data, we show that information on news sources (and authors) can be a strong indicator of credibility. Our findings suggest that an author's history of association with fake news, and the number of authors of a news article, can play a significant role in detecting fake news. Our approach can help improve traditional fake news detection methods, wherein content features are often used to detect fake news.
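The two signals named above, an author's history of association with fake news and the number of authors on an article, can be turned into simple features. The sketch below is one possible way to do that, not the paper's procedure: the input format (a list of `(authors, is_fake)` pairs), the function names, and the default rate of 0.5 for unseen authors are all assumptions made for illustration.

```python
from collections import defaultdict


def author_fake_rates(history):
    """Fraction of each author's past articles that were labelled fake.

    `history` is a hypothetical list of (authors, is_fake) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # author -> [fake articles, all articles]
    for authors, is_fake in history:
        for author in authors:
            counts[author][1] += 1
            if is_fake:
                counts[author][0] += 1
    return {author: fake / total for author, (fake, total) in counts.items()}


def credibility_features(authors, rates, default_rate=0.5):
    """Build two credibility features for one article.

    Authors with no history fall back to `default_rate` (an assumption).
    """
    known = [rates.get(author, default_rate) for author in authors]
    mean_rate = sum(known) / len(known) if known else default_rate
    return {"num_authors": len(authors), "mean_author_fake_rate": mean_rate}
```

Features like these could sit alongside content features in any standard classifier, which matches the abstract's suggestion that credibility information complements content-based detection.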

Computation and Language, Social and Information Networks

A Benchmark Study of Machine Learning Models for Online Fake News Detection

Junaed Younus Khan, Md. Tawkat Islam Khondaker, Sadia Afroz 2019

The proliferation of fake news and its propagation on social media has become a major concern due to its ability to create devastating impacts. Different machine learning approaches have been suggested to detect fake news. However, most of those focused on a specific type of news (such as political), which leads us to the question of dataset bias in the models used. In this research, we conducted a benchmark study to assess the performance of different applicable machine learning approaches on three different datasets, of which the one we accumulated is the largest and most diversified. We explored a number of advanced pre-trained language models for fake news detection along with the traditional and deep learning ones and, to the best of our knowledge, compared their performance from different aspects for the first time. We find that BERT and similar pre-trained models perform the best for fake news detection, especially with very small datasets. Hence, these models are a significantly better option for languages with limited electronic content, i.e., training data. We also carried out several analyses based on the models' performance, article topic, and article length, and discussed the lessons learned from them. We believe that this benchmark study will help the research community to explore further and news sites/blogs to select the most appropriate fake news detection method.

Computation and Language, Information Retrieval, Machine Learning

Is it Fake? News Disinformation Detection on South African News Websites

Harm de Wet, Vukosi Marivate 2021

Disinformation through fake news is an ongoing problem in our society and has become easily spread through social media. The most cost- and time-effective way to filter these large amounts of data is to use a combination of human and technical interventions to identify it. From a technical perspective, Natural Language Processing (NLP) is widely used in detecting fake news. Social media companies use NLP techniques to identify fake news and warn their users, but fake news may still slip through undetected. It is especially a problem in more localised contexts (outside the United States of America). How do we adjust fake news detection systems to work better for local contexts such as South Africa? In this work we investigate fake news detection on South African websites. We curate a dataset of South African fake news and then train detection models. We contrast this with using widely available fake news datasets (mostly from USA websites). We also explore making the datasets more diverse by combining them, and use interpretable machine learning to observe the differences in writing behaviour between nations' fake news.

Computation and Language, Computers and Society, Machine Learning