Why Do Neural Response Generation Models Prefer Universal Replies?


Published by Bowen Wu

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Bowen Wu - Nan Jiang - Zhifeng Gao

Computation and Language


Recent advances in sequence-to-sequence learning reveal a purely data-driven approach to the response generation task. Despite its diverse applications, existing neural models are prone to producing short and generic replies, making it infeasible to tackle open-domain challenges. In this research, we analyze this critical issue in light of the model's optimization goal and the specific characteristics of the human-to-human dialog corpus. By decomposing the black box into parts, we conduct a detailed analysis of the probability limit to reveal the reason behind these universal replies. Based on these analyses, we propose a max-margin ranking regularization term that keeps the models from leaning toward these replies. Finally, empirical experiments on case studies and benchmarks with several metrics validate this approach.
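The abstract names a max-margin ranking regularization term but this page does not give its formula. As a rough illustration only, the sketch below shows a generic hinge-style ranking penalty added to the usual negative log-likelihood: it is positive whenever a universal reply scores within a margin of the reference reply. The function names, the `margin` and `weight` parameters, and the way universal replies are supplied are all assumptions made for this example, not the authors' formulation.

```python
import math


def sequence_log_prob(token_probs):
    """Log-probability of one reply, given its per-token probabilities."""
    return sum(math.log(p) for p in token_probs)


def max_margin_ranking_penalty(ref_log_prob, universal_log_probs, margin=1.0):
    """Hinge penalty (hypothetical form): each universal reply contributes
    max(0, margin - (ref - universal)), so it is zero only when the
    reference reply outscores that universal reply by at least `margin`."""
    return sum(
        max(0.0, margin - (ref_log_prob - u_log_prob))
        for u_log_prob in universal_log_probs
    )


def regularized_loss(ref_log_prob, universal_log_probs, margin=1.0, weight=0.5):
    """Negative log-likelihood of the reference plus the weighted penalty.
    `weight` is an assumed trade-off hyperparameter."""
    nll = -ref_log_prob
    penalty = max_margin_ranking_penalty(ref_log_prob, universal_log_probs, margin)
    return nll + weight * penalty
```

With a reference reply scored at -2.0 and two universal replies scored at -1.0 and -5.0, only the first universal reply violates the margin, so only it contributes to the penalty.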


Bolin Wei, Shuai Lu, Lili Mou 2017

This paper addresses the question: Why do neural dialog systems generate short and meaningless replies? We conjecture that, in a dialog system, an utterance may have multiple equally plausible replies, causing the deficiency of neural networks in the dialog application. We propose a systematic way to mimic the dialog scenario in a machine translation system, and manage to reproduce the phenomenon of generating short and less meaningful sentences in the translation setting, showing evidence of our conjecture.

Computation and Language, Machine Learning

Language Scaling for Universal Suggested Replies Model

Qianlan Ying, Payal Bajaj, Budhaditya Deb 2021

We consider the problem of scaling automated suggested replies for the Outlook email system to multiple languages. Faced with increased compute requirements and low resources for language expansion, we build a single universal model for improving the quality and reducing run-time costs of our production system. However, restricted data movement across regional centers prevents joint training across languages. To this end, we propose a multi-task continual learning framework, with auxiliary tasks and language adapters to learn universal language representation across regions. The experimental results show positive cross-lingual transfer across languages while reducing catastrophic forgetting across regions. Our online results on real user traffic show significant gains in CTR and characters saved, as well as 65% training cost reduction compared with per-language models. As a consequence, we have scaled the feature to multiple languages including low-resource markets.

Computation and Language, Artificial Intelligence

Negative Training for Neural Dialogue Response Generation

Tianxing He, James Glass 2019

Although deep learning models have brought tremendous advancements to the field of open-domain dialogue response generation, recent research results have revealed that the trained models have undesirable generation behaviors, such as malicious responses and generic (boring) responses. In this work, we propose a framework named Negative Training to minimize such behaviors. Given a trained model, the framework will first find generated samples that exhibit the undesirable behavior, and then use them to feed negative training signals for fine-tuning the model. Our experiments show that negative training can significantly reduce the hit rate of malicious responses, or discourage frequent responses and improve response diversity.

Computation and Language, Machine Learning

Contextual Parameter Generation for Universal Neural Machine Translation

Emmanouil Antonios Platanios, Mrinmaya Sachan, Graham Neubig, Tom Mitchell 2018

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system, but instead introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., weights in a neural network). This parameter generator accepts source and target language embeddings as input, and generates the parameters for the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and also perform zero-shot translation. We further show it is able to surpass state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings are able to uncover interesting relationships between languages.

Computation and Language, Machine Learning

Do Neural Language Models Show Preferences for Syntactic Formalisms?

Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou 2020

Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep syntactic style of analysis, and whether the patterns are consistent across different languages. We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages, probing for two different syntactic annotation styles: Universal Dependencies (UD), prioritizing deep syntactic relations, and Surface-Syntactic Universal Dependencies (SUD), focusing on surface structure. We find that both models exhibit a preference for UD over SUD - with interesting variations across languages and layers - and that the strength of this preference is correlated with differences in tree shape.

Computation and Language, Machine Learning
