Class imbalance is a common challenge in many NLP tasks and is clearly connected to bias: bias in training data often leads to higher accuracy for majority groups at the expense of minority groups. However, there has traditionally been a disconnect between research on class-imbalanced learning and research on bias mitigation, and only recently have the two been viewed through a common lens. In this work, we evaluate long-tail learning methods for tweet sentiment classification and occupation classification, and extend a margin-loss-based approach with methods to enforce fairness. We show empirically, through controlled experiments, that the proposed approaches help mitigate both class imbalance and demographic bias.
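The abstract does not spell out which margin loss is extended or how fairness is enforced, so the following is only a minimal sketch under stated assumptions. It assumes a class-dependent margin that shrinks with class frequency (proportional to n^(-1/4), as in label-distribution-aware margin losses) and, as a stand-in fairness device, a loss average that weights each demographic group equally. All function names and the `max_margin` parameter are hypothetical and are not taken from the paper.

```python
import math


def class_margins(class_counts, max_margin=0.5):
    """Assumed margin rule: margin_j proportional to n_j ** -0.25,
    rescaled so the rarest class receives max_margin."""
    raw = [n ** -0.25 for n in class_counts]
    top = max(raw)
    return [max_margin * r / top for r in raw]


def margin_cross_entropy(logits, label, margins):
    """Cross-entropy for one example after subtracting the class margin
    from the logit of the true class only."""
    adjusted = list(logits)
    adjusted[label] -= margins[label]
    # Numerically stable log-sum-exp over the adjusted logits.
    peak = max(adjusted)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return log_norm - adjusted[label]


def group_balanced_mean(losses, groups):
    """Illustrative fairness-oriented reduction (an assumption, not the
    paper's method): average within each group, then across groups, so
    small groups count as much as large ones."""
    by_group = {}
    for loss, group in zip(losses, groups):
        by_group.setdefault(group, []).append(loss)
    group_means = [sum(v) / len(v) for v in by_group.values()]
    return sum(group_means) / len(group_means)


if __name__ == "__main__":
    margins = class_margins([1000, 10])  # class 1 is the rare class
    losses = [
        margin_cross_entropy([2.0, 1.0], 0, margins),
        margin_cross_entropy([2.0, 1.0], 1, margins),
    ]
    print(margins, losses, group_balanced_mean(losses, ["a", "b"]))
```

In this sketch the rare class gets the larger margin, so its examples must be classified with more confidence before the loss becomes small, while the group-balanced reduction keeps a large demographic group from dominating the training signal.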
References used
https://aclanthology.org/