CatVRNN: Generating Category Texts via Multi-task Learning


Added by Pengsen Cheng

Publication date: 2021

Field: Informatics Engineering

Language: English

Authors: Pengsen Cheng, Jiayong Liu, Jinqiao Dai

Computation and Language


Controlling a model to generate texts of different categories is a challenging task that is receiving increasing attention. Recently, generative adversarial networks (GANs) have shown promising results in category text generation. However, texts generated by GANs usually suffer from mode collapse and training instability. To avoid these problems, we propose a novel model named category-aware variational recurrent neural network (CatVRNN), which is inspired by multi-task learning. In our model, generation and classification are trained simultaneously, with the aim of generating texts of different categories. Moreover, multi-task learning can improve the quality of generated texts when the classification task is appropriate. We also propose a function that initializes the hidden state of CatVRNN to force the model to generate texts of a specific category. Experimental results on three datasets demonstrate that our model outperforms several state-of-the-art GAN-based text generation methods in category accuracy and quality of generated texts.
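The joint training idea in the abstract can be illustrated with a minimal sketch. It assumes the combined objective is a weighted sum of a per-token generation loss and a classification loss; the abstract does not give the actual objective, so the `weight` parameter, the plain cross-entropy standing in for the variational generation loss, and the `init_hidden` scheme are all hypothetical and are not the authors' method.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-likelihood of the target index.
    return -math.log(softmax(logits)[target])

def joint_loss(token_logits, token_targets, class_logits, class_target, weight=1.0):
    # Generation loss: mean cross-entropy over the tokens of one sequence.
    # (Assumption: stands in for the model's real generation objective.)
    gen = sum(
        cross_entropy(l, t) for l, t in zip(token_logits, token_targets)
    ) / len(token_targets)
    # Classification loss: cross-entropy on the category label.
    cls = cross_entropy(class_logits, class_target)
    # Assumption: the two tasks are combined as a weighted sum.
    return gen + weight * cls, gen, cls

def init_hidden(category, num_categories, hidden_size):
    # Hypothetical category-dependent initial hidden state: a one-hot
    # category pattern repeated across the hidden vector.
    return [1.0 if i % num_categories == category else 0.0
            for i in range(hidden_size)]

total, gen, cls = joint_loss(
    token_logits=[[0.0, 0.0], [0.0, 0.0]],
    token_targets=[0, 1],
    class_logits=[0.0, 0.0, 0.0],
    class_target=2,
    weight=0.5,
)
h0 = init_hidden(category=1, num_categories=3, hidden_size=6)
```

With uniform logits the generation term equals ln 2 and the classification term equals ln 3, which makes the weighting easy to check by hand.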


Generating Personalized Dialogue via Multi-Task Meta-Learning

Jing Yang Lee, Kong Aik Lee, Woon Seng Gan (2021)

Conventional approaches to personalized dialogue generation typically require a large corpus, as well as predefined persona information. However, in a real-world setting, neither a large corpus of training data nor persona information is readily available. To address these practical limitations, we propose a novel multi-task meta-learning approach which involves training a model to adapt to new personas without relying on a large corpus or on any predefined persona information. Instead, the model is tasked with generating personalized responses based only on the dialogue context. Unlike prior work, our approach leverages the provided persona information only during training, via the introduction of an auxiliary persona reconstruction task. In this paper, we introduce two frameworks that adopt the proposed multi-task meta-learning approach: the Multi-Task Meta-Learning (MTML) framework and the Alternating Multi-Task Meta-Learning (AMTML) framework. Experimental results show that utilizing MTML and AMTML results in dialogue responses with greater persona consistency.

Computation and Language Artificial Intelligence

Generating Informative Conclusions for Argumentative Texts

Shahbaz Syed, Khalid Al-Khatib, Milad Alshomary (2021)

The purpose of an argumentative text is to support a certain conclusion. Yet conclusions are often omitted, with readers expected to infer them. While appropriate when reading an individual text, this rhetorical device limits accessibility when browsing many texts (e.g., on a search engine or on social media). In these scenarios, an explicit conclusion makes for a good candidate summary of an argumentative text. This is especially true if the conclusion is informative, emphasizing specific concepts from the text. With this paper we introduce the task of generating informative conclusions. First, Webis-ConcluGen-21 is compiled, a large-scale corpus of 136,996 samples of argumentative texts and their conclusions. Second, two paradigms for conclusion generation are investigated: one extractive, the other abstractive in nature. The latter exploits argumentative knowledge that augments the data via control codes, and finetunes the BART model on several subsets of the corpus. Third, insights are provided into the suitability of our corpus for the task, the differences between the two generation paradigms, the trade-off between informativeness and conciseness, and the impact of encoding argumentative knowledge. The corpus, code, and the trained models are publicly available.

Computation and Language

Automatic Document Sketching: Generating Drafts from Analogous Texts

Zeqiu Wu, Michel Galley, Chris Brockett (2021)

The advent of large pre-trained language models has made it possible to make high-quality predictions on how to add or change a sentence in a document. However, the high branching factor inherent to text generation impedes the ability of even the strongest language models to offer useful editing suggestions at a more global or document level. We introduce a new task, document sketching, which involves generating entire draft documents for the writer to review and revise. These drafts are built from sets of documents that overlap in form, sharing large segments of potentially reusable text, while diverging in content. To support this task, we introduce a Wikipedia-based dataset of analogous documents and investigate the application of weakly supervised methods, including the use of a transformer-based mixture of experts, together with reinforcement learning. We report experiments using automated and human evaluation methods and discuss the relative merits of these models.

Computation and Language

Adjacency List Oriented Relational Fact Extraction via Adaptive Multi-task Learning

Fubang Zhao, Zhuoren Jiang, Yangyang Kang (2021)

Relational fact extraction aims to extract semantic triplets from unstructured text. In this work, we show that all of the relational fact extraction models can be organized according to a graph-oriented analytical perspective. An efficient model, aDjacency lIst oRiented rElational faCT (DIRECT), is proposed based on this analytical framework. To alleviate challenges of error propagation and sub-task loss equilibrium, DIRECT employs a novel adaptive multi-task learning strategy with dynamic sub-task loss balancing. Extensive experiments are conducted on two benchmark datasets, and results prove that the proposed model outperforms a series of state-of-the-art (SoTA) models for relational triplet extraction.
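The abstract does not say how DIRECT balances its sub-task losses. As a rough illustration of what dynamic loss balancing can look like, the sketch below uses Dynamic Weight Averaging, a different and well-known scheme swapped in here for illustration only: each task's weight depends on how quickly its loss fell over the previous two steps. The function name `dwa_weights` and the default `temperature` are assumptions.

```python
import math

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    # Ratio > 1 means the task's loss went up; ratio < 1 means it went down.
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    # Softmax over the ratios, softened by the temperature.
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    # Scale so the weights sum to the number of tasks.
    k = len(ratios)
    return [k * e / total for e in exps]

def balanced_loss(current_losses, weights):
    # Weighted sum of the current sub-task losses.
    return sum(w * l for w, l in zip(weights, current_losses))

# Task 0 improved slowly (1.0 -> 0.9), task 1 improved quickly (1.0 -> 0.5),
# so task 0 receives the larger weight on the next step.
weights = dwa_weights(prev_losses=[0.9, 0.5], prev_prev_losses=[1.0, 1.0])
loss = balanced_loss([0.8, 0.4], weights)
```

When every task improves at the same rate, all weights equal 1 and the combined loss reduces to a plain sum.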

Computation and Language

Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model

Muthuraman Chidambaram, Yinfei Yang, Daniel Cer (2018)

A significant roadblock in multilingual neural language modeling is the lack of labeled non-English data. One potential method for overcoming this issue is learning cross-lingual text representations that can be used to transfer the performance from training on English tasks to non-English tasks, despite little to no task-specific non-English data. In this paper, we explore a natural setup for learning cross-lingual sentence representations: the dual-encoder. We provide a comprehensive evaluation of our cross-lingual representations on a number of monolingual, cross-lingual, and zero-shot/few-shot learning tasks, and also give an analysis of different learned cross-lingual embedding spaces.

Computation and Language