No Arabic abstract
Long-form narrative text generated from large language models manages a fluent impersonation of human writing, but only at the local sentence level, and lacks structure or global cohesion. We posit that many of the problems of story generation can be addressed via high-quality content planning, and present a system that focuses on how to learn good plot structures to guide story generation. We utilize a plot-generation language model along with an ensemble of rescoring models that each implement an aspect of good story-writing as detailed in Aristotles Poetics. We find that stories written with our more principled plot-structure are both more relevant to a given prompt and higher quality than baselines that do not content plan, or that plan in an unprincipled way.
Current storytelling systems focus more ongenerating stories with coherent plots regard-less of the narration style, which is impor-tant for controllable text generation. There-fore, we propose a new task, stylized story gen-eration, namely generating stories with speci-fied style given a leading context. To tacklethe problem, we propose a novel generationmodel that first plans the stylized keywordsand then generates the whole story with theguidance of the keywords. Besides, we pro-pose two automatic metrics to evaluate theconsistency between the generated story andthe specified style. Experiments demonstratesthat our model can controllably generateemo-tion-driven orevent-driven stories based onthe ROCStories dataset (Mostafazadeh et al.,2016). Our study presents insights for stylizedstory generation in further research.
Neural data-to-text generation models have achieved significant advancement in recent years. However, these models have two shortcomings: the generated texts tend to miss some vital information, and they often generate descriptions that are not consistent with the structured input data. To alleviate these problems, we propose a Neural data-to-text generation model with Dynamic content Planning, named NDP for abbreviation. The NDP can utilize the previously generated text to dynamically select the appropriate entry from the given structured data. We further design a reconstruction mechanism with a novel objective function that can reconstruct the whole entry of the used data sequentially from the hidden states of the decoder, which aids the accuracy of the generated text. Empirical results show that the NDP achieves superior performance over the state-of-the-art on ROTOWIRE dataset, in terms of relation generation (RG), content selection (CS), content ordering (CO) and BLEU metrics. The human evaluation result shows that the texts generated by the proposed NDP are better than the corresponding ones generated by NCP in most of time. And using the proposed reconstruction mechanism, the fidelity of the generated text can be further improved significantly.
Large-scale pretrained language models have shown thrilling generation capabilities, especially when they generate consistent long text in thousands of words with ease. However, users of these models can only control the prefix of sentences or certain global aspects of generated text. It is challenging to simultaneously achieve fine-grained controllability and preserve the state-of-the-art unconditional text generation capability. In this paper, we first propose a new task named Outline to Story (O2S) as a test bed for fine-grained controllable generation of long text, which generates a multi-paragraph story from cascaded events, i.e. a sequence of outline events that guide subsequent paragraph generation. We then create dedicate datasets for future benchmarks, built by state-of-the-art keyword extraction techniques. Finally, we propose an extremely simple yet strong baseline method for the O2S task, which fine tunes pre-trained language models on augmented sequences of outline-story pairs with simple language modeling objective. Our method does not introduce any new parameters or perform any architecture modification, except several special tokens as delimiters to build augmented sequences. Extensive experiments on various datasets demonstrate state-of-the-art conditional story generation performance with our model, achieving better fine-grained controllability and user flexibility. Our paper is among the first ones by our knowledge to propose a model and to create datasets for the task of outline to story. Our work also instantiates research interest of fine-grained controllable generation of open-domain long text, where controlling inputs are represented by short text.
We investigate large-scale latent variable models (LVMs) for neural story generation -- an under-explored application for open-domain long text -- with objectives in two threads: generation effectiveness and controllability. LVMs, especially the variational autoencoder (VAE), have achieved both effective and controllable generation through exploiting flexible distributional latent representations. Recently, Transformers and its variants have achieved remarkable effectiveness without explicit latent representation learning, thus lack satisfying controllability in generation. In this paper, we advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers to enhance controllability without hurting state-of-the-art generation effectiveness. Specifically, we integrate latent representation vectors with a Transformer-based pre-trained architecture to build conditional variational autoencoder (CVAE). Model components such as encoder, decoder and the variational posterior are all built on top of pre-trained language models -- GPT2 specifically in this paper. Experiments demonstrate state-of-the-art conditional generation ability of our model, as well as its excellent representation learning capability and controllability.
Data-to-text generation can be conceptually divided into two parts: ordering and structuring the information (planning), and generating fluent language describing the information (realization). Modern neural generation systems conflate these two steps into a single end-to-end differentiable system. We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization. For training a plan-to-text generator, we present a method for matching reference texts to their corresponding text plans. For inference time, we describe a method for selecting high-quality text plans for new inputs. We implement and evaluate our approach on the WebNLG benchmark. Our results demonstrate that decoupling text planning from neural realization indeed improves the systems reliability and adequacy while maintaining fluent output. We observe improvements both in BLEU scores and in manual evaluations. Another benefit of our approach is the ability to output diverse realizations of the same input, paving the way to explicit control over the generated text structure.