
Cross Copy Network for Dialogue Generation

Published by Changzhen Ji
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





In the past few years, audiences from different fields have witnessed the achievements of sequence-to-sequence models (e.g., LSTM+attention, Pointer Generator Networks, and Transformer) in enhancing dialogue content generation. While content fluency and accuracy often serve as the major indicators for model training, dialogue logics, which carry critical information in some particular domains, are often ignored. Taking customer service and court debate dialogues as examples, compatible logics can be observed across different dialogue instances, and this information can provide vital evidence for utterance generation. In this paper, we propose a novel network architecture, Cross Copy Networks (CCN), to simultaneously explore the current dialogue context and the logical structure of similar dialogue instances. Experiments on two tasks, court debate and customer service content generation, show that the proposed algorithm is superior to existing state-of-the-art content generation models.
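Below is a minimal PyTorch sketch of the dual-copy idea described in the abstract, assuming a pointer-generator-style decoder: the final token distribution mixes a vocabulary distribution with copy distributions over the current dialogue context and over a retrieved similar dialogue. The class name, the three-way gate, and the tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualCopyMixer(nn.Module):
    """Hypothetical mixer combining generation with two copy sources."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size, vocab_size)
        # Three-way gate: generate from the vocabulary, copy from the current
        # context, or copy from the retrieved similar dialogue.
        self.gate = nn.Linear(hidden_size, 3)

    def forward(self, dec_state, ctx_attn, ctx_ids, sim_attn, sim_ids):
        # dec_state: (batch, hidden)   decoder state at the current step
        # ctx_attn:  (batch, ctx_len)  attention weights over the current context
        # ctx_ids:   (batch, ctx_len)  vocabulary ids of the context tokens
        # sim_attn:  (batch, sim_len)  attention weights over the similar dialogue
        # sim_ids:   (batch, sim_len)  vocabulary ids of the similar-dialogue tokens
        p_vocab = F.softmax(self.vocab_proj(dec_state), dim=-1)
        mix = F.softmax(self.gate(dec_state), dim=-1)  # (batch, 3)

        # Scatter attention weights back onto the vocabulary to form copy distributions.
        p_ctx = torch.zeros_like(p_vocab).scatter_add(1, ctx_ids, ctx_attn)
        p_sim = torch.zeros_like(p_vocab).scatter_add(1, sim_ids, sim_attn)

        return mix[:, 0:1] * p_vocab + mix[:, 1:2] * p_ctx + mix[:, 2:3] * p_sim
```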




Read also

Cross-domain natural language generation (NLG) is still a difficult task within spoken dialogue modelling. Given a semantic representation provided by the dialogue manager, the language generator should generate sentences that convey desired information. Traditional template-based generators can produce sentences with all necessary information, but these sentences are not sufficiently diverse. With RNN-based models, the diversity of the generated sentences can be high, however, in the process some information is lost. In this work, we improve an RNN-based generator by considering latent information at the sentence level during generation using the conditional variational autoencoder architecture. We demonstrate that our model outperforms the original RNN-based generator, while yielding highly diverse sentences. In addition, our model performs better when the training data is limited.
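As a rough illustration of the sentence-level latent variable mentioned above, the following PyTorch sketch draws a latent code from a recognition network during training and from a prior network at inference, using the reparameterization trick. All names and dimensions are assumptions rather than the paper's actual code.

```python
import torch
import torch.nn as nn

class SentenceCVAE(nn.Module):
    """Illustrative conditional-VAE latent module for an RNN generator."""

    def __init__(self, cond_dim=128, sent_dim=128, latent_dim=32):
        super().__init__()
        self.recognition = nn.Linear(cond_dim + sent_dim, 2 * latent_dim)  # q(z|x,c)
        self.prior = nn.Linear(cond_dim, 2 * latent_dim)                   # p(z|c)

    def forward(self, cond, sent_enc=None):
        # cond: encoding of the semantic representation from the dialogue manager
        # sent_enc: encoding of the target sentence (available only during training)
        if sent_enc is not None:
            mu, logvar = self.recognition(torch.cat([cond, sent_enc], -1)).chunk(2, -1)
        else:
            mu, logvar = self.prior(cond).chunk(2, -1)
        # Reparameterization trick; z would then condition the RNN decoder.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # During training, a KL(q || p) term is added to the reconstruction loss.
        return z, mu, logvar
```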
While several state-of-the-art approaches to dialogue state tracking (DST) have shown promising performances on several benchmarks, there is still a significant performance gap between seen slot values (i.e., values that occur in both the training set and the test set) and unseen ones (values that occur in the test set but not in the training set). Recently, the copy-mechanism has been widely used in DST models to handle unseen slot values, which copies slot values from the user utterance directly. In this paper, we aim to find out the factors that influence the generalization ability of a common copy-mechanism model for DST. Our key observations include: 1) the copy-mechanism tends to memorize values rather than infer them from contexts, which is the primary reason for unsatisfactory generalization performance; 2) greater diversity of slot values in the training set increases the performance on unseen values but slightly decreases the performance on seen values. Moreover, we propose a simple but effective algorithm of data augmentation to train copy-mechanism models, which augments the input dataset by copying user utterances and replacing the real slot values with randomly generated strings. Users could use two hyper-parameters to realize a trade-off between the performances on seen values and unseen ones, as well as a trade-off between overall performance and computational cost. Experimental results on three widely used datasets (WoZ 2.0, DSTC2, and Multi-WoZ 2.0) show the effectiveness of our approach.
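The augmentation step described above is simple enough to sketch directly: copy a user utterance and swap the real slot value for a randomly generated string, so the copy mechanism has to locate values from context instead of memorizing them. The hyper-parameter names below are placeholders standing in for the paper's two knobs.

```python
import random
import string

def augment(utterance: str, slot_value: str, n_copies: int = 2, value_len: int = 6):
    """Return augmented (utterance, slot_value) pairs with random fake values."""
    augmented = []
    for _ in range(n_copies):
        fake = "".join(random.choices(string.ascii_lowercase, k=value_len))
        # Replace the real slot value with the random string in the copied utterance.
        augmented.append((utterance.replace(slot_value, fake), fake))
    return augmented

# Example:
# augment("book a table at golden house", "golden house")
# -> [("book a table at qzkfwd", "qzkfwd"), ("book a table at mplrai", "mplrai")]
```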
Puhai Yang, Heyan Huang, 2020
As a key component in a dialogue system, dialogue state tracking plays an important role. It is very important for dialogue state tracking to deal with the problem of unknown slot values. As far as we know, almost all existing approaches depend on pointer networks to solve the unknown slot value problem. These pointer network-based methods usually have a hidden assumption that there is at most one out-of-vocabulary word in an unknown slot value, due to the nature of a pointer network. However, often, there are multiple out-of-vocabulary words in an unknown slot value, which makes the existing methods perform poorly. To tackle the problem, in this paper, we propose a novel Context-Sensitive Generation network (CSG) which can facilitate the representation of out-of-vocabulary words when generating the unknown slot value. Extensive experiments show that our proposed method performs better than the state-of-the-art baselines.
Neural conversation systems generate responses based on the sequence-to-sequence (SEQ2SEQ) paradigm. Typically, the model is equipped with a single set of learned parameters to generate responses for given input contexts. When confronting diverse conversations, its adaptability is rather limited and the model is hence prone to generate generic responses. In this work, we propose an Adaptive Neural Dialogue generation model, AdaND, which manages various conversations with conversation-specific parameterization. For each conversation, the model generates parameters of the encoder-decoder by referring to the input context. In particular, we propose two adaptive parameterization mechanisms: a context-aware and a topic-aware parameterization mechanism. The context-aware parameterization directly generates the parameters by capturing local semantics of the given context. The topic-aware parameterization enables parameter sharing among conversations with similar topics by first inferring the latent topics of the given context and then generating the parameters with respect to the distributional topics. Extensive experiments conducted on a large-scale real-world conversational dataset show that our model achieves superior performance in terms of both quantitative metrics and human evaluations.
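One plausible reading of the context-aware parameterization is a small hypernetwork that maps the context encoding to the weights of a decoder layer, so each conversation gets its own parameters. The sketch below follows that reading; the module name and dimensions are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ContextAwareLayer(nn.Module):
    """Hypothetical hypernetwork: context encoding -> per-conversation layer weights."""

    def __init__(self, ctx_dim=128, in_dim=64, out_dim=64):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Generate a flattened weight matrix and a bias from the context encoding.
        self.weight_gen = nn.Linear(ctx_dim, in_dim * out_dim)
        self.bias_gen = nn.Linear(ctx_dim, out_dim)

    def forward(self, ctx, h):
        # ctx: (batch, ctx_dim) context encoding; h: (batch, in_dim) decoder input
        W = self.weight_gen(ctx).view(-1, self.out_dim, self.in_dim)
        b = self.bias_gen(ctx)
        # Apply the generated, conversation-specific linear transformation.
        return torch.bmm(W, h.unsqueeze(-1)).squeeze(-1) + b
```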
Tianxing He, James Glass, 2019
Although deep learning models have brought tremendous advancements to the field of open-domain dialogue response generation, recent research results have revealed that the trained models have undesirable generation behaviors, such as malicious responses and generic (boring) responses. In this work, we propose a framework named Negative Training to minimize such behaviors. Given a trained model, the framework will first find generated samples that exhibit the undesirable behavior, and then use them to feed negative training signals for fine-tuning the model. Our experiments show that negative training can significantly reduce the hit rate of malicious responses, or discourage frequent responses and improve response diversity.
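A hedged sketch of the negative-training loop described above: sample responses, flag undesirable ones with a user-supplied detector, and push their log-likelihood down during fine-tuning. The model interface (generate, log_prob), the detector, and the loss weighting are all assumptions, not the paper's code.

```python
import torch

def negative_training_step(model, contexts, is_undesirable, optimizer, weight=1.0):
    """One illustrative fine-tuning step using negative signals only."""
    # model.generate / model.log_prob are assumed interfaces of a seq2seq model.
    samples = model.generate(contexts)             # sampled responses
    log_probs = model.log_prob(contexts, samples)  # (batch,) log p(y|x)
    bad = torch.tensor(
        [is_undesirable(s) for s in samples], dtype=torch.float
    ).to(log_probs.device)

    # Negative signal: minimizing bad * log p(y|x) lowers the probability of the
    # flagged samples. In practice this is combined with ordinary fine-tuning on
    # good data so the model does not degenerate.
    loss = weight * (bad * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```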