Although deep neural networks have been widely employed and proven effective in sentiment analysis tasks, it remains challenging for model developers to assess, prior to deployment, whether their models make erroneous predictions. Once a model is deployed, emergent errors can be hard to identify at prediction run-time and impossible to trace back to their sources. To address these gaps, we propose an error detection framework for sentiment analysis based on explainable features. We perform global-level feature validation with human-in-the-loop assessment, followed by an integration of global-level and local-level feature contribution analysis. Experimental results show that, given limited human-in-the-loop intervention, our method is able to identify erroneous model predictions on unseen data with high precision.
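The two-stage idea described above (human validation of global features, then a check of how much each individual prediction relies on rejected features) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear bag-of-words scoring, the weights in `GLOBAL_WEIGHTS`, the `REJECTED_FEATURES` set, and the `threshold` parameter are all assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List, Set

# Hypothetical global feature weights of a linear sentiment model
# (a positive weight pushes toward the positive class). Illustrative values only.
GLOBAL_WEIGHTS: Dict[str, float] = {
    "great": 2.0,
    "terrible": -2.5,
    "movie": 0.25,
    "monday": -1.5,  # assumed spurious: a weekday should not carry sentiment
}

# Features a human reviewer rejected during global-level validation (assumed).
REJECTED_FEATURES: Set[str] = {"monday"}


def local_contributions(tokens: List[str]) -> Dict[str, float]:
    """Per-feature contribution to the score for one input (count * weight)."""
    contrib: Dict[str, float] = {}
    for tok in tokens:
        if tok in GLOBAL_WEIGHTS:
            contrib[tok] = contrib.get(tok, 0.0) + GLOBAL_WEIGHTS[tok]
    return contrib


def flag_suspect(tokens: List[str], threshold: float = 0.5) -> bool:
    """Flag a prediction when rejected features supply more than `threshold`
    of the total absolute contribution for this input."""
    contrib = local_contributions(tokens)
    total = sum(abs(v) for v in contrib.values())
    if total == 0.0:
        return False  # no known features fired, nothing to attribute
    rejected = sum(abs(v) for f, v in contrib.items() if f in REJECTED_FEATURES)
    return rejected / total > threshold


if __name__ == "__main__":
    print(flag_suspect(["movie", "monday"]))  # relies mostly on a rejected feature
    print(flag_suspect(["great", "movie"]))   # relies only on accepted features
```

In this sketch the human effort is limited to reviewing the global feature list once; the per-prediction check is then automatic. A real system would obtain global and local contributions from an explanation method applied to the deployed neural model rather than from hand-written weights.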
References used
https://aclanthology.org/