
Designing Toxic Content Classification for a Diversity of Perspectives

Posted by: Deepak Kumar
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





In this work, we demonstrate how existing classifiers for identifying toxic comments online fail to generalize to the diverse concerns of Internet users. We survey 17,280 participants to understand how user expectations for what constitutes toxic content differ across demographics, beliefs, and personal experiences. We find that groups historically at risk of harassment - such as people who identify as LGBTQ+ or young adults - are more likely to flag a random comment drawn from Reddit, Twitter, or 4chan as toxic, as are people who have personally experienced harassment in the past. Based on our findings, we show how current one-size-fits-all toxicity classification algorithms, like the Perspective API from Jigsaw, can improve in accuracy by 86% on average through personalized model tuning. Ultimately, we highlight current pitfalls and new design directions that can improve the equity and efficacy of toxic content classifiers for all users.
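As a concrete illustration of what personalized tuning can look like, the sketch below fits a per-user decision threshold on top of a generic toxicity score, such as one returned by the Perspective API. The scores, labels, and threshold-search procedure are illustrative assumptions, not the paper's exact method.

```python
# Hedged sketch: per-user threshold tuning on top of a generic toxicity
# score (e.g., one returned by the Perspective API). The paper's actual
# tuning procedure may differ; scores and labels here are illustrative.
from typing import List, Tuple

def tune_user_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Pick the decision threshold that maximizes accuracy on one
    user's labeled comments. `examples` pairs a base-model toxicity
    score in [0, 1] with that user's own toxic/not-toxic judgment."""
    candidates = sorted({score for score, _ in examples})
    best_threshold, best_accuracy = 0.5, 0.0
    for threshold in candidates:
        correct = sum(
            (score >= threshold) == label for score, label in examples
        )
        accuracy = correct / len(examples)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold

# Illustrative per-user calibration data: (generic score, user's label).
user_examples = [(0.92, True), (0.55, True), (0.40, False), (0.10, False)]
print(tune_user_threshold(user_examples))  # 0.55 for this user
```

A user more sensitive to harassment ends up with a lower threshold than the one-size-fits-all default, which is one simple way a shared base model can be adapted per user.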


Read also

Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans. Explanations are particularly important for tasks like offensive language or toxicity detection on social media because a manual appeal process is often in place to dispute automatically flagged content. In this work, we propose a technique to improve the interpretability of these models, based on a simple and powerful assumption: a post is at least as toxic as its most toxic span. We incorporate this assumption into transformer models by scoring a post based on the maximum toxicity of its spans and augmenting the training process to identify correct spans. We find this approach effective, and according to a human study it can produce explanations that exceed the quality of those provided by Logistic Regression analysis (often regarded as a highly interpretable model).
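The scoring rule itself is easy to sketch: a post's toxicity is the maximum over its spans, and the argmax span doubles as the explanation. The toy lexicon-based span scorer below stands in for the transformer; only the max-pooling structure mirrors the described approach.

```python
# Hedged sketch of the "a post is at least as toxic as its most toxic
# span" scoring rule. A toy lexicon stands in for the transformer span
# scorer; only the max-pooling structure mirrors the described approach.
TOXIC_LEXICON = {"idiot": 0.9, "stupid": 0.7, "hate": 0.6}  # illustrative

def span_scores(post: str, max_len: int = 3):
    """Score every contiguous word span up to `max_len` words."""
    words = post.lower().split()
    for i in range(len(words)):
        for j in range(i + 1, min(i + 1 + max_len, len(words) + 1)):
            span = words[i:j]
            # Toy scorer: mean lexicon weight over the span's words.
            score = sum(TOXIC_LEXICON.get(w, 0.0) for w in span) / len(span)
            yield " ".join(span), score

def post_toxicity(post: str):
    """Post score = maximum toxicity over its spans, which also yields
    the most-toxic span as a human-readable explanation."""
    best_span, best_score = max(span_scores(post), key=lambda p: p[1])
    return best_score, best_span

print(post_toxicity("you are a total idiot honestly"))  # (0.9, 'idiot')
```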
The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecedented global health crisis. Digital contact tracing, as a transmission intervention measure, has shown its effectiveness in pandemic control. Despite intensive research on digital contact tracing, existing solutions can hardly meet users' requirements for privacy and convenience. In this paper, we propose BU-Trace, a novel permissionless mobile system for privacy-preserving intelligent contact tracing based on QR code and NFC technologies. First, a user study is conducted to investigate and quantify the user acceptance of a mobile contact tracing system. Second, a decentralized system is proposed to enable contact tracing while protecting user privacy. Third, an intelligent behavior detection algorithm is designed to ease the use of our system. We implement BU-Trace and conduct extensive experiments in several real-world scenarios. The experimental results show that BU-Trace achieves privacy-preserving and intelligent mobile contact tracing without requesting location or other privacy-related permissions.
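One way to realize permissionless, decentralized tracing is sketched below: the device stores hashed (venue, time-slot) tokens scanned from venue QR codes and matches them locally against tokens later published for infected visits. The token format and matching flow are assumptions for illustration; BU-Trace's actual protocol details, including its NFC path, may differ.

```python
# Hedged sketch of decentralized, permissionless contact tracing in the
# spirit of BU-Trace: the phone stores hashed (venue, time-slot) tokens
# scanned from QR codes locally, and matches them against tokens later
# published for infected visits. Token format and flow are assumptions.
import hashlib
from datetime import datetime, timezone

def visit_token(venue_id: str, when: datetime, slot_minutes: int = 15) -> str:
    """Hash a venue ID with a coarse time slot, so no raw location or
    identity ever leaves the device."""
    slot = int(when.timestamp()) // (slot_minutes * 60)
    return hashlib.sha256(f"{venue_id}|{slot}".encode()).hexdigest()

class LocalTraceLog:
    """On-device log; no server upload, no location permission needed."""
    def __init__(self):
        self.tokens = set()

    def record_scan(self, venue_id: str, when: datetime):
        self.tokens.add(visit_token(venue_id, when))

    def check_exposure(self, published_infected_tokens: set) -> bool:
        # Matching happens locally against the published token list.
        return not self.tokens.isdisjoint(published_infected_tokens)

log = LocalTraceLog()
now = datetime(2021, 2, 1, 12, 3, tzinfo=timezone.utc)
log.record_scan("cafe-42", now)
print(log.check_exposure({visit_token("cafe-42", now)}))  # True
```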
Due to the outbreak of COVID-19, users are increasingly turning to online services. An increase in social media usage has also been observed, leading to the suspicion that this has also raised cyberbullying. In this initial work, we explore the possibility of an increase in cyberbullying incidents due to the pandemic and high social media usage. To evaluate this trend, we collected 454,046 cyberbullying-related public tweets posted between January 1st, 2020 and June 7th, 2020. We aggregate the tweets, grouped by keyword, into daily counts. Our analysis showed the existence of at most one statistically significant changepoint for most of these keywords, primarily located around the end of March. Almost all of these changepoint locations can be attributed to COVID-19, which substantiates our initial hypothesis of an increase in cyberbullying through analysis of discussions over Twitter.
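A minimal version of such a changepoint analysis can be sketched as follows: for a daily count series, find the single split that most reduces squared error around the two segment means. The authors' exact test statistic is not specified here, and the counts below are illustrative.

```python
# Hedged sketch of single-changepoint detection on a daily keyword-count
# series. Finds the split that most reduces squared error around the
# two segment means; the paper's exact statistic may differ.
def best_changepoint(counts):
    """Return (day_index, cost_reduction) for the best single split."""
    def sse(xs):
        if not xs:
            return 0.0
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs)

    total = sse(counts)
    best_t, best_gain = None, 0.0
    for t in range(1, len(counts)):
        gain = total - sse(counts[:t]) - sse(counts[t:])
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain

# Illustrative daily counts with a level shift partway through.
daily = [30, 28, 33, 31, 29, 70, 74, 69, 72, 71]
print(best_changepoint(daily))  # splits at index 5, before the jump
```

In practice, the cost reduction would be compared against a permutation or bootstrap baseline to decide whether the detected changepoint is statistically significant.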
In online social networks (OSNs), users quite often disclose sensitive information about themselves by publishing messages. At the same time, they are in many cases unable to properly manage access to this sensitive information due to the following issues: i) the rigidness of the access control mechanism implemented by the OSN, and ii) many users' lack of technical knowledge about data privacy and access control. To tackle these limitations, in this paper we propose a dynamic, transparent and privacy-driven access control mechanism for textual messages published in OSNs. The notion of privacy-driven is achieved by analyzing the semantics of the messages to be published and, according to that, assessing the degree of sensitiveness of their contents. For this purpose, the proposed system relies on an automatic semantic annotation mechanism that, by using knowledge bases and linguistic tools, is able to associate a meaning to the information to be published. By means of this annotation, our mechanism automatically detects the information that is sensitive according to the privacy requirements of the publisher of the data, with regard to the type of reader that may access such data. Finally, our access control mechanism automatically creates sanitized versions of the messages to be published.
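The sanitization idea can be sketched with a toy knowledge base that maps terms to semantic categories, a publisher policy that caps the sensitivity each reader type may see, and redaction of anything above that cap. The categories, scales, and reader types below are assumptions; the paper's semantic annotator draws on full knowledge bases and linguistic tools.

```python
# Hedged sketch of privacy-driven sanitization: annotate terms via a
# (toy) knowledge base, assess sensitivity per reader type, and redact
# terms the publisher's policy deems too sensitive for that audience.
KNOWLEDGE_BASE = {            # term -> semantic category (illustrative)
    "diabetes": "health",
    "salary": "finance",
    "madrid": "location",
}
# Publisher policy: sensitivity per category and clearance per reader
# type, both on an illustrative 0-2 scale.
SENSITIVITY = {"health": 2, "finance": 2, "location": 1}
READER_CLEARANCE = {"friend": 2, "coworker": 1, "public": 0}

def sanitize(message: str, reader_type: str) -> str:
    """Replace terms whose category sensitivity exceeds the reader's
    clearance, producing a sanitized version of the message."""
    clearance = READER_CLEARANCE[reader_type]
    out = []
    for word in message.split():
        category = KNOWLEDGE_BASE.get(word.lower().strip(".,"))
        if category and SENSITIVITY[category] > clearance:
            out.append("[redacted]")
        else:
            out.append(word)
    return " ".join(out)

print(sanitize("My diabetes treatment starts in Madrid.", "coworker"))
# -> "My [redacted] treatment starts in Madrid."
```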
Widespread Chinese social media applications such as Weibo are widely known for monitoring and deleting posts to conform to Chinese government requirements. In this paper, we focus on analyzing a dataset of censored and uncensored posts on Weibo. Unlike previous work that only considers the text content of posts, we take a multi-modal approach that accounts for both text and image content. We categorize this dataset into 14 categories that have the potential to be censored on Weibo, and seek to quantify censorship by topic. Specifically, we investigate how different factors interact to affect censorship, and how consistently and how quickly different topics are censored. To this end, we assembled an image dataset with 18,966 images, as well as a text dataset with 994 posts from the 14 categories. We then applied deep learning, CNN localization, and NLP techniques to analyze the dataset and extract categories, to better understand censorship mechanisms on Weibo. We found that sentiment is the only indicator of censorship that is consistent across the variety of topics we identified, a finding that matches recently leaked logs from Sina Weibo. We also discovered that categories related to anti-government actions (e.g., protest) or politicians (e.g., Xi Jinping) are often censored, whereas some categories, such as crisis-related ones (e.g., rainstorm), are less frequently censored. Finally, censored posts across all categories are deleted within three hours on average.
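A late-fusion view of such a multi-modal analysis is sketched below: a text sentiment score and an image-category risk signal feed a logistic model of censorship probability. The feature extractors and weights are toy stand-ins for the NLP and CNN pipelines described above, not values fitted to the Weibo dataset.

```python
# Hedged sketch of multi-modal censorship prediction by late fusion:
# a text sentiment score and an image-category signal are combined in
# a logistic model. Features and weights are illustrative assumptions.
import math

def predict_censorship(sentiment: float, image_category_risk: float,
                       w_sent=-2.0, w_img=1.5, bias=-1.0) -> float:
    """Probability a post is censored. `sentiment` in [-1, 1]
    (negative = critical tone), `image_category_risk` in [0, 1]
    (e.g., protest imagery high, rainstorm imagery low). Weights are
    illustrative, not fitted to the Weibo dataset."""
    logit = w_sent * sentiment + w_img * image_category_risk + bias
    return 1.0 / (1.0 + math.exp(-logit))

# A negative-sentiment post with protest-like imagery scores high...
print(round(predict_censorship(sentiment=-0.8, image_category_risk=0.9), 2))
# ...while a neutral crisis post (e.g., rainstorm imagery) scores lower.
print(round(predict_censorship(sentiment=0.1, image_category_risk=0.2), 2))
```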