
Improving Long Distance Slot Carryover in Spoken Dialogue Systems

Published by: Chetan Naik
Publication date: 2019
Research field: Informatics Engineering
Paper language: English





Tracking the state of the conversation is a central component of task-oriented spoken dialogue systems. One approach to tracking the dialogue state is slot carryover, where a model makes a binary decision about whether a slot from the context is relevant to the current turn. Previous work on the slot carryover task used models that made independent decisions for each slot. A close analysis of the results shows that this approach performs poorly on dialogues with longer context. In this paper, we propose to model the slots jointly. We present two neural network architectures: one based on pointer networks, which incorporates slot ordering information, and the other based on transformer networks, which uses a self-attention mechanism to model slot interdependencies. Our experiments on an internal dialogue benchmark dataset and on the public DSTC2 dataset demonstrate that the proposed models resolve longer-distance slot references while achieving competitive performance.
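To make the joint formulation concrete, the following is a minimal, hypothetical PyTorch sketch of the transformer-style variant: candidate context slots are encoded as vectors, self-attention lets each slot condition on every other slot, and a shared classifier scores each slot for carryover. The class name, dimensions, and feature construction are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a transformer-based joint carryover scorer.
# All hyperparameters and the feature construction are assumptions.
import torch
import torch.nn as nn

class JointSlotCarryover(nn.Module):
    def __init__(self, slot_dim=256, num_heads=4, num_layers=2):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=slot_dim, nhead=num_heads, batch_first=True)
        # Self-attention lets each candidate slot attend to every other slot,
        # so carryover decisions are made jointly rather than independently.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(slot_dim, 1)  # one carryover score per slot

    def forward(self, slot_embeddings, padding_mask=None):
        # slot_embeddings: (batch, num_slots, slot_dim); each vector is assumed
        # to already encode the slot key, value, and dialogue context.
        encoded = self.encoder(slot_embeddings, src_key_padding_mask=padding_mask)
        return torch.sigmoid(self.classifier(encoded)).squeeze(-1)  # (batch, num_slots)

# Example: score 5 candidate context slots for carryover into the current turn.
model = JointSlotCarryover()
scores = model(torch.randn(1, 5, 256))
carry_over = scores > 0.5
```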




Read also

In a spoken dialogue system, the dialogue state tracker (DST) tracks the state of the conversation by updating, at the current user turn, a distribution over the values of each tracked slot, using the interactions up to that point. Much previous work has relied on modeling the natural order of the conversation, using distance-based offsets as an approximation of time. In this work, we hypothesize that leveraging the wall-clock temporal difference between turns is crucial for finer-grained control of dialogue scenarios. We develop a novel approach that applies a "time mask", based on the wall-clock time difference, to the associated slot embeddings, and empirically demonstrate that our proposed approach outperforms existing approaches that leverage distance offsets, on both an internal benchmark dataset and DSTC2.
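As a rough illustration of the time-mask idea summarized above, the sketch below gates each slot embedding with a vector computed from the wall-clock gap between turns. The gating form (a sigmoid over a small MLP of log-seconds) and all dimensions are assumptions made for illustration, not the paper's stated design.

```python
# Hypothetical time mask: a gate derived from the elapsed wall-clock time
# between turns, applied elementwise to the slot embedding.
import torch
import torch.nn as nn

class TimeMask(nn.Module):
    def __init__(self, slot_dim=256, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, slot_dim))

    def forward(self, slot_embedding, delta_seconds):
        # delta_seconds: (batch, 1) wall-clock gap between the slot's turn and
        # the current turn; log-compress so large gaps do not dominate.
        gate = torch.sigmoid(self.mlp(torch.log1p(delta_seconds)))
        return slot_embedding * gate  # slots from long ago can be attenuated

# Example: mask two slot embeddings, one 3 s old and one 7 min old.
masked = TimeMask()(torch.randn(2, 256), torch.tensor([[3.0], [420.0]]))
```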
Language models (LMs) for interactive speech recognition systems are trained on large amounts of data, and the model parameters are optimized on past user data. New application intents and interaction types are released for these systems over time, imposing challenges for adapting the LMs, since the existing training data is no longer sufficient to model future user interactions. It is unclear how to adapt LMs to new application intents without degrading performance on existing applications. In this paper, we propose a solution that (a) estimates n-gram counts directly from the hand-written grammar for training LMs and (b) uses constrained optimization to optimize the system parameters for future use cases while not degrading performance on past usage. We evaluated our approach on new application intents for a personal assistant system and find that the adaptation improves the word error rate by up to 15% on new applications, even when no adaptation data is available for an application.
This paper presents two ways of dealing with scarce data in semantic decoding using N-best speech recognition hypotheses. First, we learn features using a deep learning architecture in which the weights for the unknown and known categories are jointly optimised. Second, an unsupervised method is used to further tune the weights. Sharing weights injects prior knowledge into the unknown categories. The unsupervised tuning (i.e. risk minimisation) improves the F-measure when recognising nearly zero-shot data on the DSTC3 corpus. This unsupervised method can be applied subject to two assumptions: the rank of the class marginal is assumed to be known, and the class-conditional scores of the classifier are assumed to follow a Gaussian distribution.
Cross-domain natural language generation (NLG) remains a difficult task in spoken dialogue modelling. Given a semantic representation provided by the dialogue manager, the language generator should produce sentences that convey the desired information. Traditional template-based generators can produce sentences containing all the necessary information, but these sentences are not sufficiently diverse. With RNN-based models, the diversity of the generated sentences can be high; however, some information is lost in the process. In this work, we improve an RNN-based generator by considering latent information at the sentence level during generation, using the conditional variational autoencoder architecture. We demonstrate that our model outperforms the original RNN-based generator while yielding highly diverse sentences. In addition, our model performs better when training data is limited.
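The sentence-level latent conditioning can be illustrated with a minimal, assumed CVAE step: encode the semantic representation and the target sentence into a Gaussian posterior, sample a latent with the reparameterization trick, and pass it to the decoder. The module below is a generic sketch of that technique, not the paper's model.

```python
# Hypothetical CVAE posterior step; dimensions and naming are assumptions.
import torch
import torch.nn as nn

class SentenceCVAE(nn.Module):
    def __init__(self, cond_dim=64, sent_dim=128, latent_dim=32):
        super().__init__()
        self.to_mu = nn.Linear(cond_dim + sent_dim, latent_dim)
        self.to_logvar = nn.Linear(cond_dim + sent_dim, latent_dim)

    def forward(self, semantic_repr, sentence_enc):
        h = torch.cat([semantic_repr, sentence_enc], dim=-1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # KL term against a standard normal prior, added to the training loss.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return z, kl  # z conditions the RNN decoder during generation

z, kl = SentenceCVAE()(torch.randn(4, 64), torch.randn(4, 128))
```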
Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-objective reinforcement learning to significantly reduce the number of training dialogues required. We apply our proposed method to find optimized component weights for six domains and compare them to a default baseline.
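A minimal sketch of the quantity being tuned in the paragraph above: a scalarized reward that weights a success bonus against a length penalty. The component names and weight values here are purely illustrative; the multi-objective search described above would choose such weights per domain.

```python
# Illustrative scalarization of a multi-component dialogue reward.
def scalarized_reward(success: bool, num_turns: int,
                      w_success: float = 20.0, w_length: float = -1.0) -> float:
    # Weighted success bonus plus a per-turn penalty; the weights are the
    # quantities a reward-weight search would optimize.
    return w_success * float(success) + w_length * num_turns

print(scalarized_reward(success=True, num_turns=7))  # 13.0
```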