Few-shot table-to-text generation is the task of composing fluent and faithful sentences to convey table content using limited training data. Although many efforts have been made to generate impressively fluent sentences by fine-tuning powerful pre-trained language models, the faithfulness of the generated content still needs improvement. To this end, this paper proposes a novel approach, Attend, Memorize and Generate (AMG), inspired by the human text generation process. In particular, AMG (1) attends over the context at multiple granularities, using a novel strategy that combines table-slot-level and traditional token-by-token attention to exploit both the table structure and natural linguistic information; (2) dynamically memorizes the table slot allocation states; and (3) generates faithful sentences conditioned on both the context and the memorized allocation states. Comprehensive experiments with human evaluation on three domains (i.e., Humans, Songs, and Books) of the Wiki dataset show that our model generates higher-quality texts than several state-of-the-art baselines, in both fluency and faithfulness.
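The slot-gated attention idea behind steps (1) and (2) can be illustrated with a minimal sketch. This is not the authors' implementation (AMG is built on a fine-tuned pre-trained language model); it is a toy NumPy illustration under assumed names: token-level attention weights are pooled into per-slot weights, a coverage memory records how much each slot has already been expressed, and covered slots are down-weighted at the next decoding step.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend_and_memorize(query, token_keys, slot_ids, coverage):
    """Illustrative (hypothetical) multi-granularity attention step.

    query      : (d,) decoder state at the current step
    token_keys : (num_tokens, d) encoded table tokens
    slot_ids   : slot index of each table token
    coverage   : (num_slots,) memory of how covered each slot is
    """
    # Token-level attention over the linearized table.
    token_attn = softmax(token_keys @ query)

    # Slot-level attention: pool token weights within each slot.
    num_slots = coverage.shape[0]
    slot_attn = np.zeros(num_slots)
    for t, s in enumerate(slot_ids):
        slot_attn[s] += token_attn[t]

    # Memory gating: suppress slots that are already covered,
    # so the decoder is steered toward unexpressed table content.
    gated = slot_attn * (1.0 - coverage)
    total = gated.sum()
    if total > 0:
        gated = gated / total

    # Dynamically update the slot-allocation memory.
    new_coverage = np.clip(coverage + gated, 0.0, 1.0)
    return token_attn, gated, new_coverage

# Toy usage: 6 table tokens spread over 3 slots.
rng = np.random.default_rng(0)
token_keys = rng.normal(size=(6, 4))
query = rng.normal(size=4)
slot_ids = [0, 0, 1, 1, 2, 2]
coverage = np.zeros(3)
token_attn, slot_attn, coverage = attend_and_memorize(
    query, token_keys, slot_ids, coverage
)
```

Across decoding steps, the coverage memory grows monotonically, which is one simple way to encourage faithfulness: a slot that has already contributed to the output receives progressively less attention.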