Improving Robustness of Task Oriented Dialog Systems

123 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Arash Einolghozati

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Arash Einolghozati - Sonal Gupta - Mrinal Mohit

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Task oriented language understanding in dialog systems is often modeled using intents (task of a query) and slots (parameters for that task). Intent detection and slot tagging are, in turn, modeled using sentence classification and word tagging techniques respectively. Similar to adversarial attack problems with computer vision models discussed in existing literature, these intent-slot tagging models are often over-sensitive to small variations in input -- predicting different and often incorrect labels when small changes are made to a query, thus reducing their accuracy and reliability. However, evaluating a models robustness to these changes is harder for language since words are discrete and an automated change (e.g. adding `noise) to a query sometimes changes the meaning and thus labels of a query. In this paper, we first describe how to create an adversarial test set to measure the robustness of these models. Furthermore, we introduce and adapt adversarial training methods as well as data augmentation using back-translation to mitigate these issues. Our experiments show that both techniques improve the robustness of the system substantially and can be combined to yield the best results.

قيم البحث

118 - Arash Einolghozati , Panupong Pasupat , Sonal Gupta 2019

Semantic parsing using hierarchical representations has recently been proposed for task oriented dialog with promising results [Gupta et al 2018]. In this paper, we present three different improvements to the model: contextualized embeddings, ensembl ing, and pairwise re-ranking based on a language model. We taxonomize the errors possible for the hierarchical representation, such as wrong top intent, missing spans or split spans, and show that the three approaches correct different kinds of errors. The best model combines the three techniques and gives 6.4% better exact match accuracy than the state-of-the-art, with an error reduction of 33%, resulting in a new state-of-the-art result on the Task Oriented Parsing (TOP) dataset.

الحساب واللغة الذكاء الاصطناعي

Conversation Learner -- A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems

78 - Swadheen Shukla , Lars Liden , Shahin Shayandeh 2020

Traditionally, industry solutions for building a task-oriented dialog system have relied on helping dialog authors define rule-based dialog managers, represented as dialog flows. While dialog flows are intuitively interpretable and good for simple sc enarios, they fall short of performance in terms of the flexibility needed to handle complex dialogs. On the other hand, purely machine-learned models can handle complex dialogs, but they are considered to be black boxes and require large amounts of training data. In this demonstration, we showcase Conversation Learner, a machine teaching tool for building dialog managers. It combines the best of both approaches by enabling dialog authors to create a dialog flow using familiar tools, converting the dialog flow into a parametric model (e.g., neural networks), and allowing dialog authors to improve the dialog manager (i.e., the parametric model) over time by leveraging user-system dialog logs as training data through a machine teaching interface.

الحساب واللغة الذكاء الاصطناعي

A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining

251 - Boliang Zhang , Ying Lyu , Ning Ding 2021

This paper describes our submission for the End-to-end Multi-domain Task Completion Dialog shared task at the 9th Dialog System Technology Challenge (DSTC-9). Participants in the shared task build an end-to-end task completion dialog system which is evaluated by human evaluation and a user simulator based automatic evaluation. Different from traditional pipelined approaches where modules are optimized individually and suffer from cascading failure, we propose an end-to-end dialog system that 1) uses Generative Pretraining 2 (GPT-2) as the backbone to jointly solve Natural Language Understanding, Dialog State Tracking, and Natural Language Generation tasks, 2) adopts Domain and Task Adaptive Pretraining to tailor GPT-2 to the dialog domain before finetuning, 3) utilizes heuristic pre/post-processing rules that greatly simplify the prediction tasks and improve generalizability, and 4) equips a fault tolerance module to correct errors and inappropriate responses. Our proposed method significantly outperforms baselines and ties for first place in the official evaluation. We make our source code publicly available.

الحساب واللغة الذكاء الاصطناعي

A Tailored Pre-Training Model for Task-Oriented Dialog Generation

224 - Jing Gu , Qingyang Wu , Chongruo Wu 2020

The recent success of large pre-trained language models such as BERT and GPT-2 has suggested the effectiveness of incorporating language priors in downstream dialog generation tasks. However, the performance of pre-trained models on the dialog task i s not as optimal as expected. In this paper, we propose a Pre-trained Role Alternating Language model (PRAL), designed specifically for task-oriented conversational systems. We adopted (Wu et al., 2019) that models two speakers separately. We also design several techniques, such as start position randomization, knowledge distillation, and history discount to improve pre-training performance. We introduce a task-oriented dialog pretraining dataset by cleaning 13 existing data sets. We test PRAL on three different downstream tasks. The results show that PRAL performs better or on par with state-of-the-art methods.

الحساب واللغة الذكاء الاصطناعي

SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

179 - Satwik Kottur , Seungwhan Moon , Alborz Geramifard 2021

We present a new corpus for the Situated and Interactive Multimodal Conversations, SIMMC 2.0, aimed at building a successful multimodal assistant agent. Specifically, the dataset features 11K task-oriented dialogs (117K utterances) between a user and a virtual assistant on the shopping domain (fashion and furniture), grounded in situated and photo-realistic VR scenes. The dialogs are collected using a two-phase pipeline, which first generates simulated dialog flows via a novel multimodal dialog simulator we propose, followed by manual paraphrasing of the generated utterances. In this paper, we provide an in-depth analysis of the collected dataset, and describe in detail the four main benchmark tasks we propose for SIMMC 2.0. The preliminary analysis with a baseline model highlights the new challenges that the SIMMC 2.0 dataset brings, suggesting new directions for future research. Our dataset and code will be made publicly available.

الحساب واللغة الذكاء الاصطناعي