New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Augmenting Transformers with KNN-Based Composite Memory for Dialog

زيادة المحولات مع الذاكرة المركبة القائمة على KNN للحوار

490 0 3 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Various machine learning tasks can benefit from access to external information of different modalities, such as text and images. Recent work has focused on learning architectures with large memories capable of storing this knowledge. We propose augmenting generative Transformer neural networks with KNN-based Information Fetching (KIF) modules. Each KIF module learns a read operation to access fixed external knowledge. We apply these modules to generative dialog modeling, a challenging task where information must be flexibly retrieved and incorporated to maintain the topic and flow of conversation. We demonstrate the effectiveness of our approach by identifying relevant knowledge required for knowledgeable but engaging dialog from Wikipedia, images, and human-written dialog utterances, and show that leveraging this retrieved information improves model performance, measured by automatic and human evaluation.

References used

https://aclanthology.org/

rate research

On the Usability of Transformers-based Models for a French Question-Answering Task

366 - Association for Computation Linguistics 2021 مقالة

For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigmatic shift in practices from the use of task-specific architectures to the fine-tuning of pre-trained language models. The ongoin g trend consists in training models with an ever-increasing amount of data and parameters, which requires considerable resources. It leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for English. This raises questions about their usability when applied to small-scale learning problems, for which a limited amount of training data is available, especially for under-resourced languages tasks. The lack of appropriately sized corpora is a hindrance to applying data-driven and transfer learning-based approaches with strong instability cases. In this paper, we establish a state-of-the-art of the efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on the question-answering performances of French language which have few resources. We address the instability relating to data scarcity by investigating various training strategies with data augmentation, hyperparameters optimization and cross-lingual transfer. We also introduce a new compact model for French FrALBERT which proves to be competitive in low-resource settings.

transformers-based models usability of transformers-based french question-answering task الموديلات القائمة على المحولات قابلية الاستخدام للمحولات مهمة الإجابة على الأسئلة الفرنسية صناعة حمض الفوسفور المزيد..

Memory-Based Semantic Parsing

315 - Association for Computation Linguistics 2021 مقالة

Abstract We present a memory-based model for context- dependent semantic parsing. Previous approaches focus on enabling the decoder to copy or modify the parse from the previous utterance, assuming there is a dependency between the current and previo us parses. In this work, we propose to represent contextual information using an external memory. We learn a context memory controller that manages the memory by maintaining the cumulative meaning of sequential user utterances. We evaluate our approach on three semantic parsing benchmarks. Experimental results show that our model can better process context-dependent information and demonstrates improved performance without using task-specific decoders.

semantic parsing memory-based semantic parsing dependent semantic parsing تحليل الدلالي تحليل الدلالات المستندة إلى الذاكرة تحليل الدلالي المعتمدة صناعة حمض الفوسفور المزيد..

Memory-efficient Transformers via Top-k Attention

489 - Association for Computation Linguistics 2021 مقالة

Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. While these variants are memory and compute efficient, it is not possible to directly use them with popular pre-trained language models trained using vanilla attention, without an expensive corrective pre-training stage. In this work, we propose a simple yet highly accurate approximation for vanilla attention. We process the queries in chunks, and for each query, compute the top-*k* scores with respect to the keys. Our approach offers several advantages: (a) its memory usage is linear in the input size, similar to linear attention variants, such as Performer and RFA (b) it is a drop-in replacement for vanilla attention that does not require any corrective pre-training, and (c) it can also lead to significant memory savings in the feed-forward layers after casting them into the familiar query-key-value framework. We evaluate the quality of top-*k* approximation for multi-head attention layers on the Long Range Arena Benchmark, and for feed-forward layers of T5 and UnifiedQA on multiple QA datasets. We show our approach leads to accuracy that is nearly-identical to vanilla attention in multiple setups including training from scratch, fine-tuning, and zero-shot inference.

memory-efficient transformers transformers via top-k top-k attention محولات كفاءة الذاكرة المحولات عبر Top-K اهتمام Top-K صناعة حمض الفوسفور المزيد..

Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser

257 - Association for Computation Linguistics 2021 مقالة

Considering the importance of building a good Visual Dialog (VD) Questioner, many researchers study the topic under a Q-Bot-A-Bot image-guessing game setting, where the Questioner needs to raise a series of questions to collect information of an undi sclosed image. Despite progress has been made in Supervised Learning (SL) and Reinforcement Learning (RL), issues still exist. Firstly, previous methods do not provide explicit and effective guidance for Questioner to generate visually related and informative questions. Secondly, the effect of RL is hampered by an incompetent component, i.e., the Guesser, who makes image predictions based on the generated dialogs and assigns rewards accordingly. To enhance VD Questioner: 1) we propose a Related entity enhanced Questioner (ReeQ) that generates questions under the guidance of related entities and learns entity-based questioning strategy from human dialogs; 2) we propose an Augmented Guesser that is strong and is optimized for VD especially. Experimental results on the VisDial v1.0 dataset show that our approach achieves state-of-the-art performance on both image-guessing task and question diversity. Human study further verifies that our model generates more visually related, informative and coherent questions.

enhancing visual dialog visual dialog questioner تعزيز مربع الحوار البصري سؤال الحوار المرئي صناعة حمض الفوسفور

Augmenting Knowledge-grounded Conversations with Sequential Knowledge Transition

478 - Association for Computation Linguistics 2021 مقالة

Knowledge data are massive and widespread in the real-world, which can serve as good external sources to enrich conversations. However, in knowledge-grounded conversations, current models still lack the fine-grained control over knowledge selection a nd integration with dialogues, which finally leads to the knowledge-irrelevant response generation problems: 1) knowledge selection merely relies on the dialogue context, ignoring the inherent knowledge transitions along with conversation flows; 2) the models often over-fit during training, resulting with incoherent response by referring to unrelated tokens from specific knowledge content in the testing phase; 3) although response is generated upon the dialogue history and knowledge, the models often tend to overlook the selected knowledge, and hence generates knowledge-irrelevant response. To address these problems, we proposed to explicitly model the knowledge transition in sequential multi-turn conversations by abstracting knowledge into topic tags. Besides, to fully utilizing the selected knowledge in generative process, we propose pre-training a knowledge-aware response generator to pay more attention on the selected knowledge. In particular, a sequential knowledge transition model equipped with a pre-trained knowledge-aware response generator (SKT-KG) formulates the high-level knowledge transition and fully utilizes the limited knowledge data. Experimental results on both structured and unstructured knowledge-grounded dialogue benchmarks indicate that our model achieves better performance over baseline models.

augmenting knowledge-grounded conversations sequential knowledge transition زيادة المحادثات المعرفة انتقال المعرفة المتسلسل صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Augmenting Transformers with KNN-Based Composite Memory for Dialog

زيادة المحولات مع الذاكرة المركبة القائمة على KNN للحوار

Ask ChatGPT about the research

Read More

suggested questions