New community

Subscribe to the gold package and get unlimited access to Shamra Academy

RAST: Domain-Robust Dialogue Rewriting as Sequence Tagging

RAST: إعادة كتابة الحوار القوي المجال كعلامات تسلسل

317 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The task of dialogue rewriting aims to reconstruct the latest dialogue utterance by copying the missing content from the dialogue context. Until now, the existing models for this task suffer from the robustness issue, i.e., performances drop dramatically when testing on a different dataset. We address this robustness issue by proposing a novel sequence-tagging-based model so that the search space is significantly reduced, yet the core of this task is still well covered. As a common issue of most tagging models for text generation, the model's outputs may lack fluency. To alleviate this issue, we inject the loss signal from BLEU or GPT-2 under a REINFORCE framework. Experiments show huge improvements of our model over the current state-of-the-art systems when transferring to another dataset.

References used

https://aclanthology.org/

rate research

Open-Domain Question Answering Goes Conversational via Question Rewriting

372 - Association for Computation Linguistics 2021 مقالة

We introduce a new dataset for Question Rewriting in Conversational Context (QReCC), which contains 14K conversations with 80K question-answer pairs. The task in QReCC is to find answers to conversational questions within a collection of 10M web page s (split into 54M passages). Answers to questions in the same conversation may be distributed across several web pages. QReCC provides annotations that allow us to train and evaluate individual subtasks of question rewriting, passage retrieval and reading comprehension required for the end-to-end conversational question answering (QA) task. We report the effectiveness of a strong baseline approach that combines the state-of-the-art model for question rewriting, and competitive models for open-domain QA. Our results set the first baseline for the QReCC dataset with F1 of 19.10, compared to the human upper bound of 75.45, indicating the difficulty of the setup and a large room for improvement.

question rewriting conversational question answering السؤال إعادة كتابة إجابة سؤال المحادثة صناعة حمض الفوسفور

Graph Rewriting for Enhanced Universal Dependencies

590 - Association for Computation Linguistics 2021 مقالة

This paper describes a system proposed for the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (EUD). We propose a Graph Rewriting based system for computing Enhanced Universal Dependencies, given the Basic Universal Dependencies (UD).

المهام المشتركة EUD basic universal dependencies التبعيات العالمية الأساسية صناعة حمض الفوسفور

Multi-Sentence Knowledge Selection in Open-Domain Dialogue

726 - Association for Computation Linguistics 2021 مقالة

Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research. The existing literature on open-domain knowledge selection is limited and makes certain brittle assumptions on knowledge sources to simplify the overall task, such as the existence of a single relevant knowledge sentence per context. In this work, we evaluate the existing state of open-domain conversation knowledge selection, showing where the existing methodologies regarding data and evaluation are flawed. We then improve on them by proposing a new framework for collecting relevant knowledge, and create an augmented dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++. WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing the inherent ambiguity of open-domain dialogue knowledge selection. We then benchmark various knowledge ranking algorithms on this augmented dataset with both intrinsic evaluation and extrinsic measures of response quality, showing that neural rerankers that use WOW++ can outperform rankers trained on standard datasets.

multi-sentence knowledge selection open-domain dialogue knowledge selection اختيار المعرفة متعددة الجملة الحوار مفتوح المجال اختيار المعرفة صناعة حمض الفوسفور المزيد..

Extend, don't rebuild: Phrasing conditional graph modification as autoregressive sequence labelling

396 - Association for Computation Linguistics 2021 مقالة

Deriving and modifying graphs from natural language text has become a versatile basis technology for information extraction with applications in many subfields, such as semantic parsing or knowledge graph construction. A recent work used this techniq ue for modifying scene graphs (He et al. 2020), by first encoding the original graph and then generating the modified one based on this encoding. In this work, we show that we can considerably increase performance on this problem by phrasing it as graph extension instead of graph generation. We propose the first model for the resulting graph extension problem based on autoregressive sequence labelling. On three scene graph modification data sets, this formulation leads to improvements in accuracy over the state-of-the-art between 13 and 24 percentage points. Furthermore, we introduce a novel data set from the biomedical domain which has much larger linguistic variability and more complex graphs than the scene graph modification data sets. For this data set, the state-of-the art fails to generalize, while our model can produce meaningful predictions.

phrasing conditional graph conditional graph modification الصياغة الرسمية الرسم البياني رسم بياني تعديل الرسم البياني الشرطي صناعة حمض الفوسفور

Effective Sequence-to-Sequence Dialogue State Tracking

301 - Association for Computation Linguistics 2021 مقالة

Sequence-to-sequence models have been applied to a wide variety of NLP tasks, but how to properly use them for dialogue state tracking has not been systematically investigated. In this paper, we study this problem from the perspectives of pre-trainin g objectives as well as the formats of context representations. We demonstrate that the choice of pre-training objective makes a significant difference to the state tracking quality. In particular, we find that masked span prediction is more effective than auto-regressive language modeling. We also explore using Pegasus, a span prediction-based pre-training objective for text summarization, for the state tracking model. We found that pre-training for the seemingly distant summarization task works surprisingly well for dialogue state tracking. In addition, we found that while recurrent state context representation works also reasonably well, the model may have a hard time recovering from earlier mistakes. We conducted experiments on the MultiWOZ 2.1-2.4, WOZ 2.0, and DSTC2 datasets with consistent observations.

بناء اللغة التصويرية صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

RAST: Domain-Robust Dialogue Rewriting as Sequence Tagging

RAST: إعادة كتابة الحوار القوي المجال كعلامات تسلسل

Ask ChatGPT about the research

Read More

suggested questions