Normally, summary quality measures are compared with quality scores produced by human annotators. A higher correlation with human scores is considered a fair indicator of a better measure. We discuss observations that cast doubt on this view and attempt to show that an alternative indicator is possible. Given a family of measures, we explore a criterion for selecting the best measure that does not rely on correlations with human scores. Our observations for the BLANC family of measures suggest that the criterion is universal across very different styles of summaries.
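For concreteness, the following is a minimal sketch of a BLANC-help style score, following the published description of BLANC: a summary is scored by how much it helps a masked language model reconstruct masked document tokens, relative to a neutral filler. The model choice, masking stride, and filler construction here are illustrative assumptions, not the authors' exact implementation.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def unmask_hits(prefix: str, sentence: str, every: int = 4):
    """Mask every `every`-th token of `sentence`, prepend `prefix`, and
    count how many masked tokens the model restores correctly."""
    sent_ids = tokenizer.encode(sentence, add_special_tokens=False)
    prefix_ids = tokenizer.encode(prefix, add_special_tokens=False)
    positions = list(range(0, len(sent_ids), every))
    masked = list(sent_ids)
    for p in positions:
        masked[p] = tokenizer.mask_token_id
    ids = ([tokenizer.cls_token_id] + prefix_ids + [tokenizer.sep_token_id]
           + masked + [tokenizer.sep_token_id])
    offset = 2 + len(prefix_ids)  # index of the first sentence token in `ids`
    with torch.no_grad():
        logits = model(torch.tensor([ids])).logits[0]
    hits = sum(int(logits[offset + p].argmax().item() == sent_ids[p])
               for p in positions)
    return hits, len(positions)

def blanc_help(document_sentences, summary):
    """(helped - base) fraction: positive means the summary aided unmasking."""
    filler = "." * len(summary)  # neutral filler of comparable length
    helped = base = total = 0
    for sent in document_sentences:
        h, n = unmask_hits(summary, sent)
        b, _ = unmask_hits(filler, sent)
        helped, base, total = helped + h, base + b, total + n
    return (helped - base) / max(total, 1)
```

A criterion of the kind the abstract describes would then compare members of such a family (e.g., different masking strides or underlying models) without reference to human scores.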
Sentiment analysis provides a useful overview of customer review contents. Many review websites allow a user to enter a summary in addition to a full review. Intuitively, this summary information may provide an additional benefit for review sentiment analysis.
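As an illustration of that intuition (a sketch of my own, not necessarily the paper's method), the user-written summary field can simply be concatenated with the review body before classification; the default pipeline model here is an assumption.

```python
from transformers import pipeline

# Default English sentiment model; any text classifier would do.
classifier = pipeline("sentiment-analysis")

def review_sentiment(summary: str, review: str) -> dict:
    # Prepend the summary so the most salient opinion-bearing text
    # comes first; truncation keeps long reviews within model limits.
    return classifier(f"{summary} {review}", truncation=True)[0]

print(review_sentiment("Great battery, poor camera.",
                       "After a month of use the battery still lasts two days, "
                       "but photos in low light are consistently blurry."))
```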
The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained language model to estimate the information shared between a document and its summary.
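One simple way to operationalize "information shared", sketched here under my own assumptions (an autoregressive GPT-2 rather than whatever model the paper uses), is the gain in document log-likelihood when the language model is conditioned on the summary:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
lm.eval()

def doc_log_likelihood(doc: str, prompt: str = "") -> float:
    """Sum of log p(token | prompt, preceding doc tokens) over the document."""
    prompt_ids = tok.encode(prompt) if prompt else []
    doc_ids = tok.encode(doc)
    ids = torch.tensor([prompt_ids + doc_ids])
    with torch.no_grad():
        logprobs = torch.log_softmax(lm(ids).logits[0], dim=-1)
    total = 0.0
    for i, token in enumerate(doc_ids):
        pos = len(prompt_ids) + i - 1  # logits at pos predict the next token
        if pos >= 0:                   # the very first token has no context
            total += logprobs[pos, token].item()
    return total

def shared_information(doc: str, summary: str) -> float:
    # Positive values: the summary makes the document more predictable.
    return doc_log_likelihood(doc, prompt=summary) - doc_log_likelihood(doc)
```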
The creation of a large summarization quality dataset is a considerable, expensive, time-consuming effort, requiring careful planning and setup. It includes producing human-written and machine-generated summaries and evaluation of the summaries by humans.
In text summarization, evaluating the efficacy of automatic metrics without human judgments has recently become popular. One exemplar work concludes that automatic metrics strongly disagree when ranking high-scoring summaries. In this paper, we revisit this finding.
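The disagreement claim being revisited is easy to reproduce in miniature (the scores below are made up for illustration): compare the rank correlation between two metrics over all summaries against the correlation restricted to the top-scoring ones.

```python
from scipy.stats import kendalltau

def agreement(metric_a, metric_b, top_k):
    """Kendall's tau over all summaries vs. over the top_k by metric_a."""
    tau_all, _ = kendalltau(metric_a, metric_b)
    top = sorted(range(len(metric_a)), key=lambda i: metric_a[i],
                 reverse=True)[:top_k]
    tau_top, _ = kendalltau([metric_a[i] for i in top],
                            [metric_b[i] for i in top])
    return tau_all, tau_top

# Hypothetical scores from two metrics for eight summaries:
rouge     = [0.21, 0.35, 0.40, 0.42, 0.44, 0.45, 0.46, 0.47]
bertscore = [0.60, 0.72, 0.80, 0.79, 0.83, 0.81, 0.82, 0.84]
print(agreement(rouge, bertscore, top_k=4))
```

Agreement computed over the narrow high-scoring range is typically much weaker than over the full range, which is the phenomenon at issue.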
Significant progress has been made in deep-learning based Automatic Essay Scoring (AES) systems over the past two decades. However, little research has been devoted to understanding and interpreting the black-box nature of these deep-learning based scoring models.