Most summarization tasks focus on generating relatively short summaries. Such a length constraint may not be appropriate when summarizing scientific work. The LongSumm task requires participants to generate a long summary for a scientific document. Tasks like this can usually be solved with a language model, but an important problem is that models like BERT are memory-limited and cannot handle an input as long as a full document. Generating a long output is also difficult. In this paper, we propose a session-based automatic summarization model (SBAS) that uses a session and ensemble mechanism to generate long summaries. Our model achieves the best performance on the LongSumm task.
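The core idea behind a session mechanism can be sketched as follows: split the long document into consecutive "sessions" that each fit within an encoder's input limit, summarize each session independently, and concatenate the partial summaries into one long summary. This is a minimal illustrative sketch, not the authors' actual SBAS implementation; the function names, the whitespace token count, and the toy lead-sentence summarizer are all assumptions for illustration.

```python
def split_into_sessions(sentences, max_tokens=512):
    """Group consecutive sentences into 'sessions' whose combined
    whitespace-token count stays under max_tokens, so each session
    fits a BERT-style encoder's input limit (assumed budget)."""
    sessions, current, count = [], [], 0
    for sent in sentences:
        n = len(sent.split())
        if current and count + n > max_tokens:
            sessions.append(current)
            current, count = [], 0
        current.append(sent)
        count += n
    if current:
        sessions.append(current)
    return sessions

def summarize_long_document(sentences, summarize_session, max_tokens=512):
    """Summarize each session independently and join the partial
    summaries into one long summary."""
    sessions = split_into_sessions(sentences, max_tokens)
    return " ".join(summarize_session(s) for s in sessions)

# Toy stand-in summarizer (hypothetical): keep each session's first sentence.
# A real system would call a neural abstractive or extractive model here.
lead_sentence = lambda session: session[0]

doc = ["Sentence %d of a long paper." % i for i in range(100)]
long_summary = summarize_long_document(doc, lead_sentence, max_tokens=60)
print(long_summary)
```

Each session stays under the token budget, so any fixed-input-length model can process it; the per-session summaries are then combined, which is where an ensemble step over multiple summarizers could also be applied.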