Underreporting of errors in NLG output, and what to do about it

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

We observe a severe under-reporting of the different kinds of errors that Natural Language Generation systems make. This is a problem, because mistakes are an important indicator of where systems should still be improved. If authors only report overall performance metrics, the research community is left in the dark about the specific weaknesses that are exhibited by 'state-of-the-art' research. Next to quantifying the extent of error under-reporting, this position paper provides recommendations for error identification, analysis and reporting.

References used

https://aclanthology.org/

'Just What do You Think You're Doing, Dave?' A Checklist for Responsible Data Use in NLP

Association for Computational Linguistics, 2021 (article)

A key part of the NLP ethics movement is responsible use of data, but exactly what that means or how it can be best achieved remain unclear. This position paper discusses the core legal and ethical principles for collection and sharing of textual data, and the tensions between them. We propose a potential checklist for responsible data (re-)use that could both standardise the peer review of conference submissions, as well as enable a more in-depth view of published research across the community. Our proposal aims to contribute to the development of a consistent standard for data (re-)use, embraced across NLP conferences.

It's Commonsense, isn't it? Demystifying Human Evaluations in Commonsense-Enhanced NLG Systems

Association for Computational Linguistics, 2021 (article)

Common sense is an integral part of human cognition which allows us to make sound decisions, communicate effectively with others and interpret situations and utterances. Endowing AI systems with commonsense knowledge capabilities will help us get closer to creating systems that exhibit human intelligence. Recent efforts in Natural Language Generation (NLG) have focused on incorporating commonsense knowledge through large-scale pre-trained language models or by incorporating external knowledge bases. Such systems exhibit reasoning capabilities without common sense being explicitly encoded in the training set. These systems require careful evaluation, as they incorporate additional resources during training which adds additional sources of errors. Additionally, human evaluation of such systems can have significant variation, making it impossible to compare different systems and define baselines. This paper aims to demystify human evaluations of commonsense-enhanced NLG systems by proposing the Commonsense Evaluation Card (CEC), a set of recommendations for evaluation reporting of commonsense-enhanced NLG systems, underpinned by an extensive analysis of human evaluations reported in the recent literature.

Informed Sampling for Diversity in Concept-to-Text NLG

Association for Computational Linguistics, 2021 (article)

Deep-learning models for language generation tasks tend to produce repetitive output. Various methods have been proposed to encourage lexical diversity during decoding, but this often comes at a cost to the perceived fluency and adequacy of the output. In this work, we propose to ameliorate this cost by using an Imitation Learning approach to explore the level of diversity that a language generation model can reliably produce. Specifically, we augment the decoding process with a meta-classifier trained to distinguish which words at any given timestep will lead to high-quality output. We focus our experiments on concept-to-text generation where models are sensitive to the inclusion of irrelevant words due to the strict relation between input and output. Our analysis shows that previous methods for diversity underperform in this setting, while human evaluation suggests that our proposed method achieves a high level of diversity with minimal effect on the output's fluency and adequacy.

Refocusing on Relevance: Personalization in NLG

Association for Computational Linguistics, 2021 (article)

Many NLG tasks such as summarization, dialogue response, or open domain question answering, focus primarily on a source text in order to generate a target response. This standard approach falls short, however, when a user's intent or context of work is not easily recoverable based solely on that source text, a scenario that we argue is more the rule than the exception. In this work, we argue that NLG systems in general should place a much higher level of emphasis on making use of additional context, and suggest that relevance (as used in Information Retrieval) be thought of as a crucial tool for designing user-oriented text-generating tasks. We further discuss possible harms and hazards around such personalization, and argue that value-sensitive design represents a crucial path forward through these challenges.

Reproducing a Comparison of Hedged and Non-hedged NLG Texts

Association for Computational Linguistics, 2021 (article)

This paper describes an attempt to reproduce an earlier experiment, previously conducted by the author, that compares hedged and non-hedged NLG texts as part of the ReproGen shared challenge. This reproduction effort was only able to partially replicate results from the original study. The analysis from this reproduction effort suggests that whilst it is possible to replicate the procedural aspects of a previous study, replicating the results can prove more challenging as differences in participant type can have a potential impact.
