Responses generated by neural conversational models (NCMs) for non-task-oriented systems are difficult to evaluate. We propose contrastive response pairs (CRPs) for automatically evaluating responses from non-task-oriented NCMs. We conducted an error analysis on responses generated by an encoder-decoder recurrent neural network (RNN)-based NCM and created three types of CRPs corresponding to the three most frequent errors found in the analysis. Three NCMs of differing response quality were objectively evaluated with the CRPs and compared against a subjective assessment. The correctness scores obtained with the three types of CRPs were consistent with the results of the subjective assessment.
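To make the evaluation idea concrete, the following is a minimal sketch of how correctness over CRPs could be computed, assuming (as is common for encoder-decoder NCMs, though not specified in the abstract) that the model exposes a conditional log-likelihood score log P(response | context) and that correctness is the fraction of pairs where the acceptable response outscores the erroneous one. The scorer `score_fn`, the pair format, and the toy data below are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable, List, Tuple

def crp_correctness(
    score_fn: Callable[[str, str], float],  # assumed: log P(response | context)
    pairs: List[Tuple[str, str, str]],      # (context, acceptable, erroneous)
) -> float:
    """Fraction of contrastive pairs where the model prefers the acceptable response."""
    correct = sum(
        1
        for context, good, bad in pairs
        if score_fn(context, good) > score_fn(context, bad)
    )
    return correct / len(pairs)

if __name__ == "__main__":
    # Toy pairs mimicking two frequent NCM error types (hypothetical examples).
    toy_pairs = [
        ("how are you?", "i am fine, thanks.", "i i i i i"),   # repetition
        ("what do you do?", "i am a teacher.", "yes."),        # overly generic
    ]
    # Placeholder scorer standing in for a real NCM's log-likelihood.
    toy_score = lambda ctx, resp: -abs(len(resp) - 15) / 10.0
    print(f"correctness: {crp_correctness(toy_score, toy_pairs):.2f}")
```

With a real NCM, `score_fn` would sum the token-level log-probabilities of the response given the dialogue context, so higher correctness indicates a model less prone to the error type that the pair set targets.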