TransSent: Towards Generation of Structured Sentences with Discourse Marker

79 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Wu Xing

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Xing Wu - Dongjun Wei - Liangjun Zang

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Structured sentences are important expressions in human writings and dialogues. Previous works on neural text generation fused semantic and structural information by encoding the entire sentence into a mixed hidden representation. However, when a generated sentence becomes complicated, the structure is difficult to be properly maintained. To alleviate this problem, we explicitly separate the modeling process of semantic and structural information. Intuitively, humans generate structured sentences by directly connecting discourses with discourse markers (such as and, but, etc.). Therefore, we propose a task that mimics this process, called discourse transfer. This task represents a structured sentence as (head discourse, discourse marker, tail discourse), and aims at tail discourse generation based on head discourse and discourse marker. We also propose a corresponding model called TransSent, which interprets the relationship between two discourses as a translation1 from the head discourse to the tail discourse in the embedding space. We experiment TransSent not only in discourse transfer task but also in free text generation and dialogue generation tasks. Automatic and human evaluation results show that TransSent can generate structured sentences with high quality, and has certain scalability in different tasks.

قيم البحث

72 - Boyuan Pan , Yazheng Yang , Zhou Zhao 2019

Natural Language Inference (NLI), also known as Recognizing Textual Entailment (RTE), is one of the most important problems in natural language processing. It requires to infer the logical relationship between two given sentences. While current appro aches mostly focus on the interaction architectures of the sentences, in this paper, we propose to transfer knowledge from some important discourse markers to augment the quality of the NLI model. We observe that people usually use some discourse markers such as so or but to represent the logical relationship between two sentences. These words potentially have deep connections with the meanings of the sentences, thus can be utilized to help improve the representations of them. Moreover, we use reinforcement learning to optimize a new objective function with a reward defined by the property of the NLI datasets to make full use of the labels information. Experiments show that our method achieves the state-of-the-art performance on several large-scale datasets.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

Randomized Deep Structured Prediction for Discourse-Level Processing

112 - Manuel Widmoser , Maria Leonor Pacheco , Jean Honorio 2021

Expressive text encoders such as RNNs and Transformer Networks have been at the center of NLP models in recent work. Most of the effort has focused on sentence-level tasks, capturing the dependencies between words in a single sentence, or pairs of se ntences. However, certain tasks, such as argumentation mining, require accounting for longer texts and complicated structural dependencies between them. Deep structured prediction is a general framework to combine the complementary strengths of expressive neural encoders and structured inference for highly structured domains. Nevertheless, when the need arises to go beyond sentences, most work relies on combining the output scores of independently trained classifiers. One of the main reasons for this is that constrained inference comes at a high computational cost. In this paper, we explore the use of randomized inference to alleviate this concern and show that we can efficiently leverage deep structured prediction and expressive neural encoders for a set of tasks involving complicated argumentative structures.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

AgentGraph: Towards Universal Dialogue Management with Structured Deep Reinforcement Learning

138 - Lu Chen , Zhi Chen , Bowen Tan 2019

Dialogue policy plays an important role in task-oriented spoken dialogue systems. It determines how to respond to users. The recently proposed deep reinforcement learning (DRL) approaches have been used for policy optimization. However, these deep mo dels are still challenging for two reasons: 1) Many DRL-based policies are not sample-efficient. 2) Most models dont have the capability of policy transfer between different domains. In this paper, we propose a universal framework, AgentGraph, to tackle these two problems. The proposed AgentGraph is the combination of GNN-based architecture and DRL-based algorithm. It can be regarded as one of the multi-agent reinforcement learning approaches. Each agent corresponds to a node in a graph, which is defined according to the dialogue domain ontology. When making a decision, each agent can communicate with its neighbors on the graph. Under AgentGraph framework, we further propose Dual GNN-based dialogue policy, which implicitly decomposes the decision in each turn into a high-level global decision and a low-level local decision. Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark. Moreover, when transferred from the source task to a target task, these models not only have acceptable initial performance but also converge much faster on the target task.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

A Sequential Neural Encoder with Latent Structured Description for Modeling Sentences

63 - Yu-Ping Ruan , Qian Chen , 2017

In this paper, we propose a sequential neural encoder with latent structured description (SNELSD) for modeling sentences. This model introduces latent chunk-level representations into conventional sequential neural encoders, i.e., recurrent neural ne tworks (RNNs) with long short-term memory (LSTM) units, to consider the compositionality of languages in semantic modeling. An SNELSD model has a hierarchical structure that includes a detection layer and a description layer. The detection layer predicts the boundaries of latent word chunks in an input sentence and derives a chunk-level vector for each word. The description layer utilizes modified LSTM units to process these chunk-level vectors in a recurrent manner and produces sequential encoding outputs. These output vectors are further concatenated with word vectors or the outputs of a chain LSTM encoder to obtain the final sentence representation. All the model parameters are learned in an end-to-end manner without a dependency on additional text chunking or syntax parsing. A natural language inference (NLI) task and a sentiment analysis (SA) task are adopted to evaluate the performance of our proposed model. The experimental results demonstrate the effectiveness of the proposed SNELSD model on exploring task-dependent chunking patterns during the semantic modeling of sentences. Furthermore, the proposed method achieves better performance than conventional chain LSTMs and tree-structured LSTMs on both tasks.

الحساب واللغة

DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

249 - Xiaodong Gu , Kang Min Yoo , Jung-Woo Ha 2020

Recent advances in pre-trained language models have significantly improved neural response generation. However, existing methods usually view the dialogue context as a linear sequence of tokens and learn to generate the next word through token-level self-attention. Such token-level encoding hinders the exploration of discourse-level coherence among utterances. This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. DialogBERT employs a hierarchical Transformer architecture. To efficiently capture the discourse-level coherence among utterances, we propose two training objectives, including masked utterance regression and distributed utterance order ranking in analogy to the original BERT training. Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines, such as BART and DialoGPT, in terms of quantitative evaluation. The human evaluation suggests that DialogBERT generates more coherent, informative, and human-like responses than the baselines with significant margins.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي