
A New Approach to Overgenerating and Scoring Abstractive Summaries

Published by Fei Liu
Publication date: 2021
Research field: Informatics Engineering
Research language: English





We propose a new approach to generate multiple variants of the target summary with diverse content and varying lengths, then score and select admissible ones according to users' needs. Abstractive summarizers trained on single reference summaries may struggle to produce outputs that achieve multiple desirable properties, i.e., capturing the most important information, being faithful to the original, and being grammatical and fluent. In this paper, we propose a two-stage strategy: stage one generates a diverse set of candidate summaries from the source text, and stage two scores them and selects the admissible ones. Importantly, our generator gives precise control over the length of the summary, which is especially useful when space is limited. Our selectors are designed to predict the optimal summary length and put special emphasis on faithfulness to the original text. Both stages can be effectively trained, optimized and evaluated. Our experiments on benchmark summarization datasets suggest that this paradigm can achieve state-of-the-art performance.
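The two-stage idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: `overgenerate` is a toy stand-in that simply truncates the source to each target length, `toy_score` combines a crude token-overlap proxy for faithfulness with a length penalty, and the names `Candidate`, `select`, `preferred_length` and `threshold` are hypothetical. A real system would use trained neural generators and selectors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    length: int  # number of whitespace-separated tokens


def overgenerate(source: str, lengths: List[int]) -> List[Candidate]:
    """Stage one (toy stand-in): emit one candidate per target length.

    Here a candidate is just the first n tokens of the source; a real
    generator would produce abstractive text of the requested length.
    """
    tokens = source.split()
    candidates = []
    for n in lengths:
        n = min(n, len(tokens))
        candidates.append(Candidate(" ".join(tokens[:n]), n))
    return candidates


def toy_score(source: str, cand: Candidate, preferred_length: int) -> float:
    """Hypothetical score: token-overlap 'faithfulness' minus a length penalty."""
    src_vocab = set(source.lower().split())
    cand_tokens = cand.text.lower().split()
    if not cand_tokens:
        return 0.0
    faithfulness = sum(t in src_vocab for t in cand_tokens) / len(cand_tokens)
    length_penalty = abs(cand.length - preferred_length) / preferred_length
    return faithfulness - length_penalty


def select(source: str, candidates: List[Candidate],
           preferred_length: int, threshold: float = 0.5) -> List[Candidate]:
    """Stage two: keep candidates scoring at or above the threshold, best first."""
    scored = [(toy_score(source, c, preferred_length), c) for c in candidates]
    admissible = [(s, c) for s, c in scored if s >= threshold]
    admissible.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in admissible]


if __name__ == "__main__":
    source = "the cat sat on the mat and looked at the dog"
    candidates = overgenerate(source, lengths=[2, 4, 8])
    for cand in select(source, candidates, preferred_length=4):
        print(cand.length, "|", cand.text)
```

Separating generation from selection in this way means the scoring criteria (here, the preferred length and the admissibility threshold) can be changed without regenerating the candidates.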




Read also

This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model along with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points, and approaches results of competitive supervised methods. Our model attains higher levels of abstraction with copied passages roughly two times shorter than prior work, and learns to compress and merge sentences without supervision.
Interleaved texts, where posts belonging to different threads occur in one sequence, are a common occurrence, e.g., online chat conversations. To quickly obtain an overview of such texts, existing systems first disentangle the posts by threads and then extract summaries from those threads. The major issues with such systems are error propagation and non-fluent summaries. To address those, we propose an end-to-end trainable hierarchical encoder-decoder system. We also introduce a novel hierarchical attention mechanism which combines three levels of information from an interleaved text, i.e., posts, phrases and words, and implicitly disentangles the threads. We evaluated the proposed system on multiple interleaved text datasets, and it outperforms a SOTA two-step system by 20-40%.
Shen Gao, Xiuying Chen, Piji Li (2019)
Under special circumstances, summaries should conform to a particular style with patterns, such as court judgments and abstracts in academic papers. To this end, the prototype document-summary pairs can be utilized to generate better summaries. There are two main challenges in this task: (1) the model needs to incorporate learned patterns from the prototype, but (2) should avoid copying contents other than the patternized words, such as irrelevant facts, into the generated summaries. To tackle these challenges, we design a model named Prototype Editing based Summary Generator (PESG). PESG first learns summary patterns and prototype facts by analyzing the correlation between a prototype document and its summary. Prototype facts are then utilized to help extract facts from the input document. Next, an editing generator generates a new summary based on the summary pattern or extracted facts. Finally, to address the second challenge, a fact checker is used to estimate mutual information between the input document and generated summary, providing an additional signal for the generator. Extensive experiments conducted on a large-scale real-world text summarization dataset show that PESG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations.
Podcast summarization is different from summarization of other data formats, such as news, patents, and scientific papers, in that podcasts are often longer, conversational, colloquial, and full of sponsorship and advertising information, which imposes great challenges for existing models. In this paper, we focus on abstractive podcast summarization and propose a two-phase approach: sentence selection and seq2seq learning. Specifically, we first select important sentences from the noisy long podcast transcripts. The selection is based on sentence similarity to the reference to reduce redundancy, and on the associated latent topics to preserve semantics. Then the selected sentences are fed into a pre-trained encoder-decoder framework for summary generation. Our approach achieves promising results regarding both ROUGE-based measures and human evaluations.
We introduce a new approach for abstractive text summarization, Topic-Guided Abstractive Summarization, which calibrates long-range dependencies from topic-level features with globally salient content. The idea is to incorporate neural topic modeling with a Transformer-based sequence-to-sequence (seq2seq) model in a joint learning framework. This design can learn and preserve the global semantics of the document, which can provide additional contextual guidance for capturing important ideas of the document, thereby enhancing the generation of the summary. We conduct extensive experiments on two datasets and the results show that our proposed model outperforms many extractive and abstractive systems in terms of both ROUGE measurements and human evaluation. Our code is available at: https://github.com/chz816/tas.