
Choose Your Own Adventure: Paired Suggestions in Collaborative Writing for Evaluating Story Generation Models



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Story generation is an open-ended and subjective task, which poses a challenge for evaluating story generation models. We present Choose Your Own Adventure, a collaborative writing setup for pairwise model evaluation. Two models generate suggestions to people as they write a short story; we ask writers to choose one of the two suggestions, and we observe which model's suggestions they prefer. The setup also allows further analysis based on the revisions people make to the suggestions. We show that these measures, combined with automatic metrics, provide an informative picture of the models' performance, both in cases where the differences in generation methods are small (nucleus vs. top-k sampling) and large (GPT2 vs. Fusion models).
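The pairwise measure described in the abstract can be illustrated with a short sketch. This is not the authors' analysis code: the function name `preference_summary`, the labels `"A"` and `"B"`, and the choice of an exact two-sided binomial (sign) test are all assumptions made for illustration. It only shows how a list of writer choices could be turned into a preference rate and a rough significance check.

```python
from math import comb


def preference_summary(choices, model_a="A", model_b="B"):
    """Summarise which model's suggestion writers picked in paired trials.

    `choices` holds one label per trial (hypothetical format, not from the paper).
    """
    wins_a = sum(1 for c in choices if c == model_a)
    wins_b = sum(1 for c in choices if c == model_b)
    n = wins_a + wins_b
    if n == 0:
        raise ValueError("no valid choices to analyse")

    # Exact two-sided binomial test under the null hypothesis that
    # writers pick either model with probability 0.5.
    k = max(wins_a, wins_b)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    p_value = min(1.0, 2 * upper_tail)

    return {"n": n, "rate_a": wins_a / n, "p_value": p_value}


if __name__ == "__main__":
    # Made-up data: model A chosen in 8 of 10 trials.
    print(preference_summary(["A"] * 8 + ["B"] * 2))
```

A preference rate near 0.5 with a large p-value would suggest writers do not clearly favour either model; the revision-based and automatic measures mentioned in the abstract are not covered by this sketch.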

References used

https://aclanthology.org/


Semantic Answer Similarity for Evaluating Question Answering Models

Association for Computational Linguistics, 2021 (article)

The evaluation of question answering models compares ground-truth annotations with model predictions. However, as of today, this comparison is mostly lexical-based and therefore misses out on answers that have no lexical overlap but are still semantically similar, thus treating correct answers as false. This underestimation of the true performance of models hinders user acceptance in applications and complicates a fair comparison of different models. Therefore, there is a need for an evaluation metric that is based on semantics instead of pure string similarity. In this short paper, we present SAS, a cross-encoder-based metric for the estimation of semantic answer similarity, and compare it to seven existing metrics. To this end, we create an English and a German three-way annotated evaluation dataset containing pairs of answers along with human judgment of their semantic similarity, which we release along with an implementation of the SAS metric and the experiments. We find that semantic similarity metrics based on recent transformer models correlate much better with human judgment than traditional lexical similarity metrics on our two newly created datasets and one dataset from related work.
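The limitation of lexical comparison described above can be seen in a minimal sketch of a token-overlap F1 score, a common lexical metric for question answering. The function `token_f1` and the example strings are illustrative assumptions; this is not the SAS metric, which relies on a cross-encoder model and is not reproduced here.

```python
from collections import Counter


def token_f1(prediction, truth):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = truth.lower().split()

    # Multiset intersection counts tokens shared by both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Same meaning, no shared tokens: the lexical score is zero.
    print(token_f1("thirty percent", "30 %"))
    # Partial token overlap gives partial credit.
    print(token_f1("the eiffel tower", "eiffel tower"))
```

The first pair is scored as a complete miss even though a human would likely accept it, which is the kind of underestimation a semantic metric aims to avoid.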


How do people interact with biased text prediction models while writing?

Association for Computational Linguistics, 2021 (article)

Recent studies have shown that a bias in the text suggestions system can percolate in the user's writing. In this pilot study, we ask the question: How do people interact with text prediction models, in an inline next phrase suggestion interface and how does introducing sentiment bias in the text prediction model affect their writing? We present a pilot study as a first step to answer this question.


Automatic Story Generation: Challenges and Attempts

Association for Computational Linguistics, 2021 (article)

Automated storytelling has long captured the attention of researchers for the ubiquity of narratives in everyday life. The best human-crafted stories exhibit coherent plot, strong characters, and adherence to genres, attributes that current states-of-the-art still struggle to produce, even using transformer architectures. In this paper, we analyze works in story generation that utilize machine learning approaches to (1) address story generation controllability, (2) incorporate commonsense knowledge, (3) infer reasonable character actions, and (4) generate creative language.


Paired Examples as Indirect Supervision in Latent Decision Models

Association for Computational Linguistics, 2021 (article)

Compositional, structured models are appealing because they explicitly decompose problems and provide interpretable intermediate outputs that give confidence that the model is not simply latching onto data artifacts. Learning these models is challenging, however, because end-task supervision only provides a weak indirect signal on what values the latent decisions should take. This often results in the model failing to learn to perform the intermediate tasks correctly. In this work, we introduce a way to leverage paired examples that provide stronger cues for learning latent decisions. When two related training examples share internal substructure, we add an additional training objective to encourage consistency between their latent decisions. Such an objective does not require external supervision for the values of the latent output, or even the end task, yet provides an additional training signal to that provided by individual training examples themselves. We apply our method to improve compositional question answering using neural module networks on the DROP dataset. We explore three ways to acquire paired questions in DROP: (a) discovering naturally occurring paired examples within the dataset, (b) constructing paired examples using templates, and (c) generating paired examples using a question generation model. We empirically demonstrate that our proposed approach improves both in- and out-of-distribution generalization and leads to correct latent decision predictions.
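The idea of adding a consistency term between the latent decisions of two paired examples can be sketched as follows. This is a simplified illustration, not the paper's objective: representing each latent decision as a discrete probability distribution, using a symmetric KL divergence as the consistency measure, and the names `symmetric_kl`, `paired_loss`, and `weight` are all assumptions.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two discrete distributions."""
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return kl_pq + kl_qp


def paired_loss(task_loss_a, task_loss_b, latent_a, latent_b, weight=1.0):
    """Sum of per-example task losses plus a consistency penalty.

    The penalty needs no labels for the latent decisions themselves:
    it only asks the two paired examples to agree with each other.
    """
    return task_loss_a + task_loss_b + weight * symmetric_kl(latent_a, latent_b)


if __name__ == "__main__":
    # Agreeing latent decisions add no penalty.
    print(paired_loss(1.0, 2.0, [0.5, 0.5], [0.5, 0.5]))
    # Disagreeing latent decisions are penalised.
    print(paired_loss(1.0, 2.0, [0.9, 0.1], [0.1, 0.9]))
```

In a real training setup the task losses and latent distributions would come from the model's forward pass, and the penalty would be differentiated along with the rest of the loss.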


GraphPlan: Story Generation by Planning with Event Graph

Association for Computational Linguistics, 2021 (article)

Story generation is a task that aims to automatically generate a meaningful story. This task is challenging because it requires high-level understanding of the semantic meaning of sentences and causality of story events. Naive sequence-to-sequence models generally fail to acquire such knowledge, as it is difficult to guarantee logical correctness in a text generation model without strategic planning. In this study, we focus on planning a sequence of events assisted by event graphs and use the events to guide the generator. Rather than using a sequence-to-sequence model to output a sequence, as in some existing works, we propose to generate an event sequence by walking on an event graph. The event graphs are built automatically based on the corpus. To evaluate the proposed approach, we incorporate human participation, both in event planning and story generation. Based on the large-scale human annotation results, our proposed approach has been shown to provide more logically correct event sequences and stories compared with previous approaches.
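A toy version of building an event graph from a corpus and generating an event sequence by walking on it might look like the sketch below. It is not the GraphPlan implementation: treating events as plain strings, counting adjacent-event transitions as edge weights, and sampling the next event in proportion to those counts are all assumptions, as are the names `build_event_graph` and `walk`.

```python
import random
from collections import defaultdict


def build_event_graph(sequences):
    """Count transitions between adjacent events across all sequences."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            graph[src][dst] += 1
    return graph


def walk(graph, start, max_len, rng):
    """Walk the graph from `start`, sampling successors by edge count."""
    path = [start]
    while len(path) < max_len:
        successors = graph.get(path[-1])
        if not successors:
            break  # dead end: no observed continuation
        events = list(successors)
        weights = [successors[e] for e in events]
        path.append(rng.choices(events, weights=weights, k=1)[0])
    return path


if __name__ == "__main__":
    # Made-up corpus of event sequences.
    corpus = [["wake", "eat", "work"], ["wake", "eat", "sleep"]]
    graph = build_event_graph(corpus)
    print(walk(graph, "wake", max_len=3, rng=random.Random(0)))
```

Every step of the resulting path follows a transition that was observed in the corpus; a generator would then be conditioned on such a path to produce the story text.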
