
TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation



Added by Association for Computational Linguistics - Article

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

shared task, ReproGen, human evaluation


Abstract

This paper describes our contribution to the Shared Task ReproGen by Belz et al. (2021), which investigates the reproducibility of human evaluations in the context of Natural Language Generation. We selected the paper "Generation of Company descriptions using concept-to-text and text-to-text deep models: data set collection and systems evaluation" (Qader et al., 2018) and aimed to replicate, as closely to the original as possible, the human evaluation and the subsequent comparison between the human judgements and the automatic evaluation metrics. Here, we first outline the text generation task of the paper of Qader et al. (2018). Then, we document how we approached our replication of the paper's human evaluation. We also discuss the difficulties we encountered and which information was missing. Our replication has medium to strong correlation (0.66 Spearman overall) with the original results of Qader et al. (2018), but due to the missing information about how Qader et al. (2018) compared the human judgements with the metric scores, we have refrained from reproducing this comparison.
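The abstract summarizes agreement between the replication and the original study as a Spearman correlation. As a rough illustration of what that statistic measures, the sketch below computes Spearman's rho as the Pearson correlation of rank-transformed scores, with tied values sharing an average rank. The function names `average_ranks` and `spearman` and the score lists are hypothetical and made up for illustration; they are not data or code from the paper, and the authors' actual computation may differ.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of entries tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mean human ratings for five system outputs (illustrative only).
original_scores = [3.2, 4.1, 2.5, 3.8, 4.6]
replication_scores = [3.0, 4.3, 3.1, 3.5, 4.4]
rho = spearman(original_scores, replication_scores)
print(f"Spearman rho: {rho:.2f}")
```

Because the statistic depends only on rank order, it rewards a replication that orders the systems the same way as the original study, even if the absolute rating values shift.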

References used

https://aclanthology.org/


Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation

Association for Computational Linguistics - 2021 - Article

Prior studies on text-to-text generation typically assume that the model could figure out what to attend to in the input and what to include in the output via seq2seq learning, with only the parallel training data and no additional guidance. However, it remains unclear whether current models can preserve important concepts in the source input, as seq2seq learning does not have explicit focus on the concepts and commonly used evaluation metrics also treat them equally important as other tokens. In this paper, we present a systematic analysis that studies whether current seq2seq models, especially pre-trained language models, are good enough for preserving important input concepts and to what extent explicitly guiding generation with the concepts as lexical constraints is beneficial. We answer the above questions by conducting extensive analytical experiments on four representative text-to-text generation tasks. Based on the observations, we then propose a simple yet effective framework to automatically extract, denoise, and enforce important input concepts as lexical constraints. This new method performs comparably or better than its unconstrained counterpart on automatic metrics, demonstrates higher coverage for concept preservation, and receives better ratings in the human evaluation. Our code is available at https://github.com/morningmoni/EDE.

evaluating and improving, improving concept preservation, improving concept

BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text

Association for Computational Linguistics - 2021 - Article

With the growing popularity of smart speakers, such as Amazon Alexa, speech is becoming one of the most important modes of human-computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to the downstream components and drastically degrade the user experience. A simple and effective way to improve the speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and not scalable. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Using a raw corpus, BTS corrupts the text using Text-to-Speech (TTS) and Speech-to-Text (STT) systems. Then, a post-processing model can be trained to reconstruct the original text given the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective in fixing non-trivial speech recognition errors such as mishandling foreign words. We present the generated parallel corpus and post-processing platform to make our results publicly available.

pre-trained models, Amazon Alexa, back transcription

SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation

Association for Computational Linguistics - 2021 - Article

We motivate and propose a suite of simple but effective improvements for concept-to-text generation called SAPPHIRE: Set Augmentation and Post-hoc PHrase Infilling and REcombination. We demonstrate their effectiveness on generative commonsense reasoning, a.k.a. the CommonGen task, through experiments using both BART and T5 models. Through extensive automatic and human evaluation, we show that SAPPHIRE noticeably improves model performance. An in-depth qualitative analysis illustrates that SAPPHIRE effectively addresses many issues of the baseline model generations, including lack of commonsense, insufficient specificity, and poor fluency.

approaches for enhanced, post-hoc phrase infilling, generation called SAPPHIRE

From Sentence Syntax to Text Syntax Concept and Practice

Al-Baath University - 2017 - Research paper

This research presents the concepts of sentence syntax and text syntax, the difference between them, and their respective areas. It also tries to identify the obstacles that hinder the progress of this kind of linguistic study in our Arab colleges. It then reviews the trends in linguistic studies where this kind of study appears, and tries to assess its current state in Syrian colleges through one sample, Al-Baath University. Finally, it closes with the most important recommendations that can contribute to developing this kind of linguistic study.

Sentence Syntax, Text Syntax

Evaluation Guidelines to Deal with Implicit Phenomena to Assess Factuality in Data-to-Text Generation

Association for Computational Linguistics - 2021 - Article

Data-to-text generation systems are trained on large datasets, such as WebNLG, RotoWire, E2E or DART. Beyond traditional token-overlap evaluation metrics (BLEU or METEOR), a key concern faced by recent generators is to control the factuality of the generated text with respect to the input data specification. We report on our experience when developing an automatic factuality evaluation system for data-to-text generation that we are testing on WebNLG and E2E data. We aim to prepare gold data annotated manually to identify cases where the text communicates more information than is warranted based on the input data (extra) or fails to communicate data that is part of the input (missing). While analyzing reference (data, text) samples, we encountered a range of systematic uncertainties that are related to cases on implicit phenomena in text, and the nature of non-linguistic knowledge we expect to be involved when assessing factuality. We derive from our experience a set of evaluation guidelines to reach high inter-annotator agreement on such cases.

guidelines to deal, assess factuality, phenomena to assess
