Deceptive news posts shared in online communities can be detected with NLP models, and much recent research has focused on developing such models. In this work, we use characteristics of online communities and authors (the context of how and where content is posted) to explain the performance of a neural network deception detection model and to identify sub-populations that are disproportionately affected by model accuracy or failure. We examine who posts the content and where it is posted. We find that while author characteristics are better predictors of deceptive content than community characteristics, both are strongly correlated with model performance. Traditional performance metrics such as F1 score may fail to capture poor model performance on isolated sub-populations such as specific authors; a more nuanced evaluation of deception detection models is therefore critical.
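The point about aggregate F1 hiding failures on small sub-populations can be illustrated with a minimal sketch. This is not the paper's evaluation code: the helper names (`f1_score`, `f1_by_group`), the toy labels, and the two author groups "a" and "b" are all assumptions made up for illustration.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class; returns 0.0 when it is undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_group(y_true, y_pred, groups):
    """F1 computed separately for each group label (here, a hypothetical author id)."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: f1_score(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy data, invented for illustration: 1 = deceptive, 0 = credible.
# Author "a" contributes 8 posts, author "b" only 2.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
groups = ["a"] * 8 + ["b"] * 2

overall = f1_score(y_true, y_pred)
per_group = f1_by_group(y_true, y_pred, groups)

print("overall F1:", overall)
for group, score in sorted(per_group.items()):
    print("author", group, "F1:", score)
```

In this toy setup the aggregate score stays fairly high even though every deceptive post from author "b" is missed, which is the kind of gap that a per-author or per-community breakdown is meant to expose.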
References used
https://aclanthology.org/