ﻻ يوجد ملخص باللغة العربية
Media plays an important role in shaping public opinion. Biased media can influence people in undesirable directions and hence should be unmasked as such. We observe that featurebased and neural text classification approaches which rely only on the distribution of low-level lexical information fail to detect media bias. This weakness becomes most noticeable for articles on new events, where words appear in new contexts and hence their bias predictiveness is unclear. In this paper, we therefore study how second-order information about biased statements in an article helps to improve detection effectiveness. In particular, we utilize the probability distributions of the frequency, positions, and sequential order of lexical and informational sentence-level bias in a Gaussian Mixture Model. On an existing media bias dataset, we find that the frequency and positions of biased statements strongly impact article-level bias, whereas their exact sequential order is secondary. Using a standard model for sentence-level bias detection, we provide empirical evidence that article-level bias detectors that use second-order information clearly outperform those without.
Media organizations bear great reponsibility because of their considerable influence on shaping beliefs and positions of our society. Any form of media can contain overly biased content, e.g., by reporting on political events in a selective or incomp
In this work, we introduce a corpus for satire detection in Romanian news. We gathered 55,608 public news articles from multiple real and satirical news sources, composing one of the largest corpora for satire detection regardless of language and the
News is a central source of information for individuals to inform themselves on current topics. Knowing a news articles slant and authenticity is of crucial importance in times of fake news, news bots, and centralization of media ownership. We introd
Fake news causes significant damage to society.To deal with these fake news, several studies on building detection models and arranging datasets have been conducted. Most of the fake news datasets depend on a specific time period. Consequently, the d
Leveraging unlabelled data through weak or distant supervision is a compelling approach to developing more effective text classification models. This paper proposes a simple but effective data augmentation method, which leverages the idea of pseudo-l