New community

Subscribe to the gold package and get unlimited access to Shamra Academy

RefSum: Refactoring Neural Summarization

Refsum: إعادة صبط التلخيص العصبي

323 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

refactoring neural summarization refactoring neural neural summarization إعادة صبط التلخيص العصبي إعادة صياغة العصبية تلخيص العصبي صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Although some recent works show potential complementarity among different state-of-the-art systems, few works try to investigate this problem in text summarization. Researchers in other areas commonly refer to the techniques of reranking or stacking to approach this problem. In this work, we highlight several limitations of previous methods, which motivates us to present a new framework Refactor that provides a unified view of text summarization and summaries combination. Experimentally, we perform a comprehensive evaluation that involves twenty-two base systems, four datasets, and three different application scenarios. Besides new state-of-the-art results on CNN/DailyMail dataset (46.18 ROUGE-1), we also elaborate on how our proposed method addresses the limitations of the traditional methods and the effectiveness of the Refactor model sheds light on insight for performance improvement. Our system can be directly used by other researchers as an off-the-shelf tool to achieve further performance improvements. We open-source all the code and provide a convenient interface to use it: https://github.com/yixinL7/Refactoring-Summarization.

References used

https://aclanthology.org/

rate research

SummEval: Re-evaluating Summarization Evaluation

342 - Association for Computation Linguistics 2021 مقالة

Abstract The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evalua tion methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization model outputs along with expert and crowd-sourced human annotations; 2) we consistently benchmark 23 recent summarization models using the aforementioned automatic evaluation metrics; 3) we assemble the largest collection of summaries generated by models trained on the CNN/DailyMail news dataset and share it in a unified format; 4) we implement and share a toolkit that provides an extensible and unified API for evaluating summarization models across a broad range of automatic metrics; and 5) we assemble and share the largest and most diverse, in terms of model types, collection of human judgments of model-generated summaries on the CNN/Daily Mail dataset annotated by both expert judges and crowd-source workers. We hope that this work will help promote a more complete evaluation protocol for text summarization as well as advance research in developing evaluation metrics that better correlate with human judgments.

re-evaluating summarization evaluation re-evaluating summarization evaluation metrics إعادة تقييم تقييم التلخيص إعادة تقييم التلخيص مقاييس التقييم صناعة حمض الفوسفور المزيد..

A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods

285 - Association for Computation Linguistics 2021 مقالة

Abstract The quality of a summarization evaluation metric is quantified by calculating the correlation between its scores and human annotations across a large number of summaries. Currently, it is unclear how precise these correlation estimates are, nor whether differences between two metrics' correlations reflect a true difference or if it is due to mere chance. In this work, we address these two problems by proposing methods for calculating confidence intervals and running hypothesis tests for correlations using two resampling methods, bootstrapping and permutation. After evaluating which of the proposed methods is most appropriate for summarization through two simulation experiments, we analyze the results of applying these methods to several different automatic evaluation metrics across three sets of human annotations. We find that the confidence intervals are rather wide, demonstrating high uncertainty in the reliability of automatic metrics. Further, although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do so in some evaluation settings.1

مجموعات البيانات الإنجليزية الحالية summarization evaluation summarization evaluation metrics تقييم تلخيص مقاييس تقييم تلخيص صناعة حمض الفوسفور

Controllable Neural Dialogue Summarization with Personal Named Entity Planning

418 - Association for Computation Linguistics 2021 مقالة

In this paper, we propose a controllable neural generation framework that can flexibly guide dialogue summarization with personal named entity planning. The conditional sequences are modulated to decide what types of information or what perspective t o focus on when forming summaries to tackle the under-constrained problem in summarization tasks. This framework supports two types of use cases: (1) Comprehensive Perspective, which is a general-purpose case with no user-preference specified, considering summary points from all conversational interlocutors and all mentioned persons; (2) Focus Perspective, positioning the summary based on a user-specified personal named entity, which could be one of the interlocutors or one of the persons mentioned in the conversation. During training, we exploit occurrence planning of personal named entities and coreference information to improve temporal coherence and to minimize hallucination in neural generation. Experimental results show that our proposed framework generates fluent and factually consistent summaries under various planning controls using both objective metrics and human evaluations.

personal named entity named entity planning neural dialogue summarization الكيان المسمى الشخصي اسمه كيان التخطيط تلخيص الحوار العصبي صناعة حمض الفوسفور المزيد..

Topic-Guided Abstractive Multi-Document Summarization

554 - Association for Computation Linguistics 2021 مقالة

A critical point of multi-document summarization (MDS) is to learn the relations among various documents. In this paper, we propose a novel abstractive MDS model, in which we represent multiple documents as a heterogeneous graph, taking semantic node s of different granularities into account, and then apply a graph-to-sequence framework to generate summaries. Moreover, we employ a neural topic model to jointly discover latent topics that can act as cross-document semantic units to bridge different documents and provide global information to guide the summary generation. Since topic extraction can be viewed as a special type of summarization that summarizes'' texts into a more abstract format, i.e., a topic distribution, we adopt a multi-task learning strategy to jointly train the topic and summarization module, allowing the promotion of each other. Experimental results on the Multi-News dataset demonstrate that our model outperforms previous state-of-the-art MDS models on both Rouge scores and human evaluation, meanwhile learns high-quality topics.

topic-guided abstractive multi-document abstractive multi-document summarization abstractive mds model المبادرة متعددة المدى متعدد الوثائق مخبأ متعدد الوثائق تلخص مبادرة MDS نموذج صناعة حمض الفوسفور المزيد..

Evaluation of Summarization Systems across Gender, Age, and Race

244 - Association for Computation Linguistics 2021 مقالة

Summarization systems are ultimately evaluated by human annotators and raters. Usually, annotators and raters do not reflect the demographics of end users, but are recruited through student populations or crowdsourcing platforms with skewed demograph ics. For two different evaluation scenarios -- evaluation against gold summaries and system output ratings -- we show that summary evaluation is sensitive to protected attributes. This can severely bias system development and evaluation, leading us to build models that cater for some groups rather than others.

summarization systems أنظمة تلخيص سن صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

RefSum: Refactoring Neural Summarization

Refsum: إعادة صبط التلخيص العصبي

Ask ChatGPT about the research

Read More

suggested questions