Warning: this paper contains content that may be offensive or upsetting. Commonsense knowledge bases (CSKBs) are increasingly used for various natural language processing tasks. Since CSKBs are mostly human-generated and may reflect societal biases, it is important to ensure that such biases are not conflated with the notion of commonsense. Here we focus on two widely used CSKBs, ConceptNet and GenericsKB, and establish the presence of bias in both in the form of two types of representational harms: overgeneralization of polarized perceptions and representation disparity across demographic groups. Next, we find similar representational harms in downstream models that use ConceptNet. Finally, we propose a filtering-based approach for mitigating such harms and observe that it can reduce the issues in both resources and models but leads to a performance drop, leaving room for future work to build fairer and stronger commonsense models.
References used
https://aclanthology.org/