
Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation


Publication date: 2021
Language: English
 Created by Shamra Editor





Although pre-training models have achieved great success in dialogue generation, their performance drops dramatically when the input contains an entity that does not appear in the pre-training and fine-tuning datasets (an unseen entity). To address this issue, existing methods leverage an external knowledge base to generate appropriate responses. In real-world practice, however, the entity may not be covered by the knowledge base, or the retrieved knowledge may be imprecise. To deal with this problem, instead of introducing the knowledge base as an input, we force the model to learn a better semantic representation by predicting the information in the knowledge base based only on the input context. Specifically, with the help of a knowledge base, we introduce two auxiliary training objectives: 1) Interpret Masked Word, which conjectures the meaning of the masked entity given the context; 2) Hypernym Generation, which predicts the hypernym of the entity based on the context. Experimental results on two dialogue corpora verify the effectiveness of our methods under both knowledge-available and knowledge-unavailable settings.
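To make the two auxiliary objectives concrete, here is a minimal training-time sketch, assuming an encoder whose contextual state at the masked entity position feeds two extra prediction heads. All names, dimensions, and loss weights are illustrative assumptions, not the authors' implementation; for brevity each head scores a single target token, whereas the paper's objectives presumably decode full definition and hypernym sequences.

```python
# Hypothetical sketch of the two auxiliary objectives; not the paper's code.
import torch
import torch.nn as nn

class KnowledgeEnhancedLoss(nn.Module):
    def __init__(self, hidden_size=768, vocab_size=30522,
                 w_interpret=1.0, w_hypernym=1.0):
        super().__init__()
        # Head 1: Interpret Masked Word -- score a definition token for
        # the masked entity from its contextual state.
        self.interpret_head = nn.Linear(hidden_size, vocab_size)
        # Head 2: Hypernym Generation -- predict the entity's hypernym
        # token (e.g. "beverage" for "latte") from the same state.
        self.hypernym_head = nn.Linear(hidden_size, vocab_size)
        self.ce = nn.CrossEntropyLoss(ignore_index=-100)
        self.w_interpret = w_interpret
        self.w_hypernym = w_hypernym

    def forward(self, gen_loss, entity_states, definition_ids, hypernym_ids):
        # entity_states: (batch, hidden) contextual vectors at the masked
        # entity positions. The labels come from the knowledge base and
        # are needed only at training time; inference uses context alone.
        interpret_loss = self.ce(self.interpret_head(entity_states), definition_ids)
        hypernym_loss = self.ce(self.hypernym_head(entity_states), hypernym_ids)
        return (gen_loss + self.w_interpret * interpret_loss
                         + self.w_hypernym * hypernym_loss)

# Toy usage with random tensors standing in for real batches.
crit = KnowledgeEnhancedLoss()
loss = crit(torch.tensor(2.3), torch.randn(4, 768),
            torch.randint(0, 30522, (4,)), torch.randint(0, 30522, (4,)))
```

Because both heads condition only on the context, the knowledge base is needed only to produce training labels and can be absent at inference time, which matches the knowledge-unavailable setting the abstract describes.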

Related research

Despite achieving remarkable performance, previous knowledge-enhanced works usually use only a single-source, homogeneous knowledge base of limited coverage. Thus, they often degenerate into traditional methods because not all dialogues can be linked with knowledge entries. This paper proposes a novel dialogue generation model, MSKE-Dialog, to solve this issue with three unique advantages: (1) Rather than only one, MSKE-Dialog can simultaneously leverage multiple heterogeneous knowledge sources (including but not limited to commonsense knowledge facts, text knowledge, and infobox knowledge) to improve knowledge coverage; (2) To avoid topic conflicts among the context and the different knowledge sources, we propose Multi-Reference Selection to better select context/knowledge; (3) We propose Multi-Reference Generation to generate informative responses by referring to multiple generation references at the same time. Extensive evaluations on a Chinese dataset show the superior performance of this work against various state-of-the-art approaches. To the best of our knowledge, this work is the first to use multi-source heterogeneous knowledge in open-domain knowledge-enhanced dialogue generation.
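As a rough illustration of how multiple heterogeneous sources might be weighed against the dialogue context (the paper's Multi-Reference Selection is surely more elaborate), here is a toy attention-based selector; all shapes and names are assumptions for illustration:

```python
# Toy sketch of context-conditioned selection over heterogeneous sources.
import torch
import torch.nn.functional as F

def select_references(context_vec, source_vecs):
    """context_vec: (hidden,); source_vecs: (n_sources, hidden).
    Returns a context-aware mixture plus the per-source weights."""
    scores = source_vecs @ context_vec      # relevance of each source
    weights = F.softmax(scores, dim=0)      # down-weights off-topic sources
    mixture = weights @ source_vecs         # (hidden,) fused reference
    return mixture, weights

ctx = torch.randn(256)
sources = torch.randn(3, 256)  # e.g. commonsense facts, text, infobox
mixed, w = select_references(ctx, sources)
```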
Dialogue State Tracking is central to multi-domain task-oriented dialogue systems, responsible for extracting information from user utterances. We present a novel hybrid architecture that augments GPT-2 with representations derived from Graph Attention Networks in such a way as to allow causal, sequential prediction of slot values. The model architecture captures inter-slot relationships and dependencies across domains that can otherwise be lost in sequential prediction. We report improvements in state-tracking performance on MultiWOZ 2.0 against a strong GPT-2 baseline and investigate a simplified sparse training scenario in which DST models are trained only on session-level annotations but evaluated at the turn level. We further report detailed analyses to demonstrate the effectiveness of graph models in DST by showing that the proposed graph modules capture inter-slot dependencies and improve the predictions of values that are common to multiple domains.
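A single-head graph-attention layer over slot nodes gives the flavor of how inter-slot dependencies could be encoded before fusing with GPT-2 hidden states; the adjacency, dimensions, and single-head choice are illustrative assumptions, not the paper's architecture:

```python
# Minimal single-head GAT-style layer over slot embeddings (assumed setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotGATLayer(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, slots, adj):
        # slots: (n_slots, dim); adj: (n_slots, n_slots) 0/1 mask linking
        # slots that share values across domains (e.g. hotel-area and
        # attraction-area). adj should include self-loops.
        h = self.proj(slots)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))
        e = e.masked_fill(adj == 0, float('-inf'))
        return F.softmax(e, dim=-1) @ h  # messages from related slots

layer = SlotGATLayer(dim=64)
out = layer(torch.randn(5, 64), torch.eye(5))  # 5 slots, self-loops only
```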
Knowledge-grounded dialogue generation has achieved promising performance with the engagement of external knowledge sources. Typical approaches to this task usually perform two relatively independent sub-tasks, i.e., knowledge selection and knowledge-aware response generation. In this paper, in order to improve the diversity of both knowledge selection and knowledge-aware response generation, we propose a collaborative latent variable (CoLV) model that integrates these two aspects simultaneously in separate yet collaborative latent spaces, so as to capture the inherent correlation between knowledge selection and response generation. During generation, our proposed model first draws a knowledge candidate from the latent space conditioned on the dialogue context, and then samples a response from another, collaborative latent space conditioned on both the context and the selected knowledge. Experimental results on two widely used knowledge-grounded dialogue datasets show that our model outperforms previous methods on both knowledge selection and response generation.
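The two-stage sampling can be sketched with a pair of reparameterized Gaussians, one conditioned on the context and one on the context plus the sampled knowledge latent; the diagonal-Gaussian choice and all sizes are assumptions for illustration, not CoLV's exact formulation:

```python
# Schematic two-stage latent sampling (assumed diagonal Gaussians).
import torch
import torch.nn as nn

class CollaborativeLatents(nn.Module):
    def __init__(self, dim=256, z_dim=64):
        super().__init__()
        self.know_prior = nn.Linear(dim, 2 * z_dim)          # p(z_k | context)
        self.resp_prior = nn.Linear(dim + z_dim, 2 * z_dim)  # p(z_r | context, z_k)

    @staticmethod
    def sample(stats):
        # Reparameterization trick: stats packs (mu, logvar).
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def forward(self, context_vec):
        z_know = self.sample(self.know_prior(context_vec))   # knowledge choice
        z_resp = self.sample(self.resp_prior(
            torch.cat([context_vec, z_know], dim=-1)))       # conditions on it
        return z_know, z_resp

model = CollaborativeLatents()
z_k, z_r = model(torch.randn(256))
```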
While neural networks produce state-of-the-art performance in several NLP tasks, they generally depend heavily on lexicalized information, which transfers poorly between domains. Previous works have proposed delexicalization as a form of knowledge distillation to reduce the dependency on such lexical artifacts. However, a critical unsolved issue remains: how much delexicalization to apply. A little helps reduce overfitting, but too much discards useful information. We propose Group Learning, a knowledge and model distillation approach for fact verification in which multiple student models have access to different delexicalized views of the data but are encouraged to learn from each other through pair-wise consistency losses. In several cross-domain experiments between the FEVER and FNC fact verification datasets, we show that our approach learns the best delexicalization strategy for the given training dataset and outperforms state-of-the-art classifiers that rely on the original data.
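A pair-wise consistency loss of this kind might look like a symmetric KL divergence between two students' label distributions; the symmetric-KL formulation is an assumption, not necessarily the paper's exact loss:

```python
# Hedged sketch of a pair-wise consistency loss between two students.
import torch
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    """Symmetric KL between the two students' label distributions."""
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction='batchmean')  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction='batchmean')  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)

a = torch.randn(4, 3)  # student A logits over {supports, refutes, nei}
b = torch.randn(4, 3)  # student B, trained on a different delexicalized view
loss = consistency_loss(a, b)
```

Each student would add this term to its own classification loss, so the ensemble agrees on predictions while each member keeps its distinct delexicalized view of the input.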
Analyzing microblogs where we post what we experience enables various applications such as social-trend analysis and entity recommendation. To track emerging trends in a variety of areas, we want to categorize information on emerging entities (e.g., Avatar 2) in microblog posts according to their types (e.g., Film). We thus introduce a new entity typing task that assigns a fine-grained type to each emerging entity when a burst of posts containing that entity is first observed in a microblog. The challenge is to perform typing from noisy microblog posts without relying on prior knowledge of the target entity. To tackle this task, we build large-scale Twitter datasets for English and Japanese using time-sensitive distant supervision. We then propose a modular neural typing model that encodes not only the entity and its contexts but also meta information in multiple posts. To type 'homographic' emerging entities (e.g., 'Go' means both an emerging programming language and a classic board game), whose contexts are noisy, we devise a context selector that finds contexts related to the target entity. Experiments on the Twitter datasets confirm the effectiveness of our typing model and the context selector.
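As a hedged sketch, a context selector could rank candidate post contexts by similarity to a representation of the target entity and keep the top-k; the cosine-similarity scoring and the fixed k are illustrative assumptions, not the paper's selector:

```python
# Toy context selector: keep the k contexts most related to the entity.
import torch
import torch.nn.functional as F

def select_contexts(entity_vec, context_vecs, k=2):
    """entity_vec: (dim,); context_vecs: (n, dim).
    Returns indices of the k contexts most similar to the entity."""
    sims = F.cosine_similarity(context_vecs, entity_vec.unsqueeze(0), dim=-1)
    return sims.topk(k).indices

entity = torch.randn(128)       # e.g. an encoding of the homograph 'Go'
contexts = torch.randn(5, 128)  # encodings of five posts mentioning it
keep = select_contexts(entity, contexts)
```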
