Abstract

Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes' theorem to factorize the dialogue task into two models: the distribution of the context given the response, and the prior for the response itself. This approach, an instantiation of the noisy channel model, both mitigates the explaining-away effect and allows the principled incorporation of large pretrained models for the response prior. We present extensive experiments showing that a noisy channel model decodes better responses compared to direct decoding, and that a two-stage pretraining strategy, employing both open-domain and task-oriented dialogue data, improves over randomly initialized models.
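The Bayes factorization described above can be sketched in a few lines: instead of scoring a candidate response r directly with p(r | context), the noisy channel ranks it by log p(context | r) + λ log p(r). The snippet below is a minimal illustration with toy, hypothetical log-probabilities (not values from the paper); in practice the two terms would come from a trained channel model and a large pretrained language-model prior.

```python
import math

# Toy candidate responses and hypothetical log-probabilities, for illustration only.
# Channel model: log p(context | response); prior: log p(response) from a pretrained LM.
candidates = ["ok", "the restaurant is on Main Street"]
log_p_context_given_r = {
    "ok": -8.0,                                # generic reply explains the context poorly
    "the restaurant is on Main Street": -2.5,  # informative reply explains it well
}
log_prior = {
    "ok": -1.0,                                # short generic replies get high prior mass
    "the restaurant is on Main Street": -6.0,
}

def noisy_channel_score(r, lam=1.0):
    # Bayes' theorem: log p(r | c) ∝ log p(c | r) + lam * log p(r)
    return log_p_context_given_r[r] + lam * log_prior[r]

best = max(candidates, key=noisy_channel_score)
```

Here the channel term log p(context | response) penalizes the generic "ok" (it does not explain the context), so the informative response wins even though the prior alone prefers the short one; this is exactly how the factorization counteracts the explaining-away effect that direct decoding suffers from.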