This paper proposes a new abstractive document summarization model, hierarchical BART (Hie-BART), which captures the hierarchical structure of a document (i.e., its sentence-word structure) in the BART model. Although the existing BART model has achieved state-of-the-art performance on document summarization tasks, it does not model interactions between sentence-level and word-level information. In machine translation tasks, the performance of neural machine translation models has been improved by incorporating multi-granularity self-attention (MG-SA), which captures the relationships between words and phrases. Inspired by that work, the proposed Hie-BART model incorporates MG-SA into the encoder of the BART model to capture sentence-word structures. Evaluations on the CNN/Daily Mail dataset show that the proposed Hie-BART model outperforms several strong baselines and improves on the performance of the non-hierarchical BART model (+0.23 ROUGE-L).
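The abstract describes the key architectural change: MG-SA is added inside the BART encoder so that part of the attention operates over sentence-level representations while the rest stays at the word level. Below is a minimal sketch of one way such a layer could be wired up. It is not the authors' released code; the module name MultiGranularitySelfAttention, the even split of heads between granularities, and the mean-pooled sentence representations are assumptions made here for illustration.

```python
# Sketch of a multi-granularity self-attention layer for a BART-style encoder.
# Assumptions: sentence boundaries are given as a token-to-sentence index map,
# sentence representations are mean-pooled token states, and half of the heads
# attend to sentence-level keys/values while the other half attend to tokens.
import torch
import torch.nn as nn


class MultiGranularitySelfAttention(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        assert n_heads % 2 == 0, "split heads evenly between granularities"
        self.word_attn = nn.MultiheadAttention(d_model, n_heads // 2, batch_first=True)
        self.sent_attn = nn.MultiheadAttention(d_model, n_heads // 2, batch_first=True)
        self.out = nn.Linear(2 * d_model, d_model)

    def forward(self, hidden: torch.Tensor, sent_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model); sent_ids: (batch, seq_len) sentence index per token.
        n_sents = int(sent_ids.max().item()) + 1
        # Mean-pool token states into sentence-level representations.
        one_hot = nn.functional.one_hot(sent_ids, n_sents).float()   # (B, T, S)
        counts = one_hot.sum(dim=1).clamp(min=1).unsqueeze(-1)       # (B, S, 1)
        sent_repr = one_hot.transpose(1, 2) @ hidden / counts        # (B, S, d)
        # Word-level heads: ordinary token-to-token self-attention.
        word_out, _ = self.word_attn(hidden, hidden, hidden)
        # Sentence-level heads: tokens query sentence representations.
        sent_out, _ = self.sent_attn(hidden, sent_repr, sent_repr)
        # Fuse the two granularities back to the model dimension.
        return self.out(torch.cat([word_out, sent_out], dim=-1))


# Usage: during fine-tuning, this module would replace (or augment) the
# self-attention sub-layer of selected BART encoder layers.
mgsa = MultiGranularitySelfAttention()
tokens = torch.randn(2, 10, 768)                            # dummy encoder states
sent_ids = torch.tensor([[0] * 4 + [1] * 6, [0] * 5 + [1] * 5])  # token-to-sentence map
out = mgsa(tokens, sent_ids)                                # (2, 10, 768)
```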