توليد القصة هي مهمة مفتوحة وعشرية، مما يشكل تحديا لتقييم نماذج جيل القصة.نقدم اختبار المغامرة الخاصة بك، إعداد الكتابة التعاوني لتقييم نموذج الزوجي.تولد طرازان اقتراحات للناس لأنهم يكتبون قصة قصيرة؛نطلب من الكتاب اختيار أحد الاقتراحين، ونحن نلاحظ اقتراحات النموذج التي يفضلونها.كما يتيح الإعداد أيضا إجراء مزيد من التحليل بناء على المراجعات التي يقوم بها الناس إلى الاقتراحات.نظظ أن هذه التدابير، إلى جانب المقاييس التلقائية، توفر صورة إعلامية لأداء النماذج، سواء في الحالات التي تكون فيها الاختلافات في طرق التوليد صغيرة (عينة من أعلى النواة مقابل Top-K) وكبير (نماذج Fusion Fusion)وبعد
Story generation is an open-ended and subjective task, which poses a challenge for evaluating story generation models. We present Choose Your Own Adventure, a collaborative writing setup for pairwise model evaluation. Two models generate suggestions to people as they write a short story; we ask writers to choose one of the two suggestions, and we observe which model's suggestions they prefer. The setup also allows further analysis based on the revisions people make to the suggestions. We show that these measures, combined with automatic metrics, provide an informative picture of the models' performance, both in cases where the differences in generation methods are small (nucleus vs. top-k sampling) and large (GPT2 vs. Fusion models).
References used
https://aclanthology.org/
The evaluation of question answering models compares ground-truth annotations with model predictions. However, as of today, this comparison is mostly lexical-based and therefore misses out on answers that have no lexical overlap but are still semanti
Recent studies have shown that a bias in thetext suggestions system can percolate in theuser's writing. In this pilot study, we ask thequestion: How do people interact with text pre-diction models, in an inline next phrase sugges-tion interface and h
Automated storytelling has long captured the attention of researchers for the ubiquity of narratives in everyday life. The best human-crafted stories exhibit coherent plot, strong characters, and adherence to genres, attributes that current states-of
Compositional, structured models are appealing because they explicitly decompose problems and provide interpretable intermediate outputs that give confidence that the model is not simply latching onto data artifacts. Learning these models is challeng
Story generation is a task that aims to automatically generate a meaningful story. This task is challenging because it requires high-level understanding of the semantic meaning of sentences and causality of story events. Naivesequence-to-sequence mod