The damaging effects of hate speech on social media have become evident over the last few years, and several organizations, researchers, and social media platforms have tried to curb it in various ways. Despite these efforts, social media users are still exposed to hate speech. The problem is even more pronounced for social groups that promote public discourse, such as journalists. In this work, we focus on countering hate speech targeted at journalistic social media accounts. To this end, a group of journalists formulated a definition of hate speech that takes into account the journalistic point of view and the types of hate speech typically directed at journalists. We then compiled a large pool of tweets referring to journalism-related accounts in multiple languages. To annotate this pool of unlabeled tweets according to the definition, we followed a concise annotation strategy that involves active learning stages. The outcome of this work is a novel, publicly available collection of Twitter datasets in five languages. Additionally, we experiment with state-of-the-art deep learning architectures for hate speech detection and use our annotated datasets to train and evaluate them. Finally, we propose an ensemble detection model that outperforms all of the individual models.
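The ensemble mentioned above can be illustrated with a minimal soft-voting sketch: each detector outputs a hate-speech probability per tweet, the probabilities are averaged, and the average is thresholded. The model names and scores below are hypothetical placeholders, not figures from the paper.

```python
# Hypothetical per-model hate-speech probabilities for four tweets.
# Model names and all scores are illustrative, not taken from the paper.
probs = {
    "bert":   [0.91, 0.20, 0.55, 0.05],
    "cnn":    [0.80, 0.35, 0.60, 0.10],
    "bilstm": [0.85, 0.15, 0.40, 0.20],
}

# Soft voting: average the probabilities across models for each tweet,
# then threshold the average at 0.5 to obtain the ensemble label.
n_models = len(probs)
avg = [sum(col) / n_models for col in zip(*probs.values())]
labels = [int(p >= 0.5) for p in avg]
print(labels)  # one binary label per tweet -> [1, 0, 1, 0]
```

Soft voting lets a confident model pull a borderline tweet over the threshold (third tweet above), which is one common reason an ensemble can outperform its individual members.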