
Weakly Supervised Extractive Summarization with Attention


Publication date: 2021
Language: English

Automatic summarization aims to extract the important information from large amounts of textual data in order to create a shorter version of the original text while preserving its information. Training traditional extractive summarization models relies heavily on human-engineered labels, such as sentence-level annotations of summary-worthiness. However, in many use cases such labels do not exist, and manually annotating thousands of documents to train a model may not be feasible. On the other hand, indirect signals for summarization are often available: agent actions for customer service dialogues, headlines for news articles, diagnoses for electronic health records, and so on. In this paper, we develop a general framework that produces extractive summaries as a byproduct of supervised learning on such indirect signals, with the help of an attention mechanism. We test our models on customer service dialogues, and experimental results demonstrate that they can reliably select informative sentences and words for automatic summarization.
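As a concrete illustration of this idea, the sketch below trains a small attention-pooling classifier on a document-level (indirect) label only; the learned attention weights over sentences are then reused as importance scores for extraction. The architecture, dimensions, and top-k cutoff are illustrative assumptions, not the paper's exact model.

import torch
import torch.nn as nn

# Minimal sketch: attention over sentence vectors, trained only on an
# indirect document-level label; attention weights double as summary scores.
class WeakSumModel(nn.Module):
    def __init__(self, sent_dim=256, num_classes=10):
        super().__init__()
        self.score = nn.Linear(sent_dim, 1)           # attention scorer
        self.clf = nn.Linear(sent_dim, num_classes)   # indirect-signal task head

    def forward(self, sents):                         # sents: (n_sents, sent_dim)
        attn = torch.softmax(self.score(sents).squeeze(-1), dim=0)
        doc = attn @ sents                            # attention-weighted doc vector
        return self.clf(doc), attn

model = WeakSumModel()
sents = torch.randn(12, 256)                          # one document, 12 sentence vectors
logits, attn = model(sents)
loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([3]))
loss.backward()                                       # supervision from the indirect label only

# At inference time, extract the top-attended sentences as the summary:
summary_idx = attn.detach().topk(3).indices.sort().values

The intuition is that the classifier can only predict the indirect label well by attending to the informative sentences, so the attention distribution becomes a free sentence-importance signal.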


Related research

Automatically extracting keyphrases from scholarly documents leads to a valuable concise representation that humans can understand and machines can process for tasks such as information retrieval, article clustering and article classification. This paper is concerned with the parts of a scientific article that should be given as input to keyphrase extraction methods. Recent deep learning methods take titles and abstracts as input due to the increased computational complexity of processing long sequences, whereas traditional approaches can also work with full texts. Titles and abstracts are dense in keyphrases but often miss important aspects of the articles, while full texts, on the other hand, are richer in keyphrases but much noisier. To address this trade-off, we propose the use of extractive summarization models on the full texts of scholarly documents. Our empirical study on 3 article collections using 3 keyphrase extraction methods shows promising results.
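A minimal sketch of this recipe, using a simple centroid-based extractive scorer as a stand-in for the summarization models evaluated in the paper; the scorer, the top_n value, and the any_keyphrase_extractor placeholder are assumptions for illustration only.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def centroid_summary(sentences, top_n=10):
    # Score each sentence by similarity to the document centroid and keep
    # the best ones, in their original order, as a pseudo-summary.
    tfidf = TfidfVectorizer().fit_transform(sentences)    # (n_sents, vocab)
    centroid = np.asarray(tfidf.mean(axis=0))             # document centroid
    scores = cosine_similarity(tfidf, centroid).ravel()
    keep = sorted(scores.argsort()[-top_n:])
    return " ".join(sentences[i] for i in keep)

# Feed the condensed full text to any keyphrase extractor (placeholder name):
# keyphrases = any_keyphrase_extractor(centroid_summary(full_text_sentences))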
Weakly-supervised text classification aims to induce text classifiers from only a few user-provided seed words. The vast majority of previous work assumes high-quality seed words are given. However, expert-annotated seed words are sometimes non-trivial to come up with. Furthermore, in the weakly-supervised learning setting, we do not have any labeled documents with which to measure the seed words' efficacy, making the seed word selection process "a walk in the dark". In this work, we remove the need for expert-curated seed words by first mining (noisy) candidate seed words associated with the category names. We then train interim models with individual candidate seed words. Lastly, we estimate the interim models' error rates in an unsupervised manner. The seed words that yield the lowest estimated error rates are added to the final seed word set. A comprehensive evaluation on six binary classification tasks across four popular datasets demonstrates that the proposed method outperforms a baseline using only category-name seed words and obtains performance comparable to a counterpart using expert-annotated seed words.
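The selection loop itself is simple, as the sketch below shows. Here mine_candidates, train_interim, and estimate_error are hypothetical callables standing in for the paper's candidate mining, interim-model training, and unsupervised error-estimation steps; they are not a published API.

def select_seed_words(category, docs, mine_candidates, train_interim,
                      estimate_error, budget=5):
    # Rank noisy candidate seed words by the estimated (unsupervised) error
    # rate of the interim model each one induces; keep the most reliable ones.
    scored = []
    for word in mine_candidates(category, docs):
        model = train_interim([category, word], docs)
        scored.append((estimate_error(model, docs), word))
    return [word for _, word in sorted(scored)[:budget]]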
Neural summarization models suffer from the length limitation of the text encoder. Long documents have to be truncated before they are fed to the model, which results in a huge loss of summary-relevant content. To address this issue, we propose the sliding selector network with dynamic memory for extractive summarization of long-form documents, which employs a sliding window to extract summary sentences segment by segment. Moreover, we adopt a memory mechanism to preserve and update the history information dynamically, allowing the semantic flow across different windows. Experimental results on two large-scale datasets consisting of scientific papers demonstrate that our model substantially outperforms previous state-of-the-art models. Besides, we perform qualitative and quantitative investigations of how our model works and where the performance gain comes from.
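The windowed control flow might look like the sketch below, where score_segment is a hypothetical stand-in for the paper's selector-plus-memory network; the window size, stride, and per-window quota are illustrative assumptions.

def sliding_extract(sentences, score_segment, window=50, stride=50, per_window=2):
    # Slide over the long document segment by segment; the memory returned by
    # score_segment carries history across windows so semantics can flow.
    memory, picked = None, []
    for start in range(0, len(sentences), stride):
        seg = sentences[start:start + window]
        scores, memory = score_segment(seg, memory)
        best = sorted(range(len(seg)), key=scores.__getitem__, reverse=True)
        picked.extend(start + i for i in best[:per_window])
    return sorted(picked)          # indices of extracted summary sentences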
To capture the semantic graph structure of raw text, most existing summarization approaches are built on GNNs with a pre-trained model. However, these methods suffer from cumbersome procedures and inefficient computations on long-text documents. To mitigate these issues, this paper proposes HetFormer, a Transformer-based pre-trained model with multi-granularity sparse attention for long-text extractive summarization. Specifically, we model different types of semantic nodes in raw text as a potential heterogeneous graph and directly learn the heterogeneous relationships (edges) among nodes with the Transformer. Extensive experiments on both single- and multi-document summarization tasks show that HetFormer achieves state-of-the-art performance in ROUGE F1 while using less memory and fewer parameters.
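One way to picture multi-granularity sparse attention is as a boolean mask over a sequence that mixes token nodes with sentence nodes: tokens attend within a local window, while sentence-level nodes connect globally and link the two granularities. The node layout and window size below are illustrative assumptions, not HetFormer's exact pattern.

import torch

def sparse_mask(n_tokens, n_sents, window=4):
    # True = attention allowed. Tokens see a local window of tokens;
    # sentence-level nodes attend (and are attended to) globally.
    n = n_tokens + n_sents
    mask = torch.zeros(n, n, dtype=torch.bool)
    for i in range(n_tokens):
        lo, hi = max(0, i - window), min(n_tokens, i + window + 1)
        mask[i, lo:hi] = True
    mask[n_tokens:, :] = True      # sentence nodes attend everywhere
    mask[:, n_tokens:] = True      # everything attends to sentence nodes
    return mask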
Paraphrase generation is a longstanding NLP task with diverse applications to downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of golden labeled data. Though unsupervised endeavors have been proposed to alleviate this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases from weakly supervised data. Specifically, we tackle the weakly-supervised paraphrase generation problem by: (1) obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo-paraphrase expansion; and (2) developing a meta-learning framework to progressively select valuable samples for fine-tuning the pre-trained language model BART on the sentential paraphrasing task. We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods.
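Stage (1), retrieval-based pseudo-paraphrase expansion, can be sketched as below. The retriever interface (nearest, sim) and the similarity threshold are hypothetical stand-ins, and stage (2)'s meta-learning selection is reduced to a comment.

def build_weak_pairs(corpus, retriever, k=5, threshold=0.8):
    # Pair each sentence with its retrieved semantic neighbours to form a
    # weakly-labeled parallel corpus for paraphrase fine-tuning.
    pairs = []
    for sent in corpus:
        for cand in retriever.nearest(sent, k=k):
            if cand != sent and retriever.sim(sent, cand) >= threshold:
                pairs.append((sent, cand))
    return pairs

# Stage (2): progressively select/weight pairs (meta-learning in the paper),
# then fine-tune BART on the surviving (source, paraphrase) pairs.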
