Neural sequence-to-sequence (Seq2Seq) models and BERT have achieved substantial improvements in abstractive document summarization (ADS) without and with pre-training, respectively. However, they sometimes repeatedly attend to unimportant source phrases while mistakenly ignoring important ones. We present reconstruction mechanisms on two levels to alleviate this issue. The sequence-level reconstructor rebuilds the whole document from the hidden layer of the target summary, while the word embedding-level one rebuilds the average of the source word embeddings at the target side, to guarantee that as much critical information as possible is included in the summary. Based on the assumption that inverse document frequency (IDF) measures how important a word is, we further leverage IDF weights in our embedding-level reconstructor. The proposed frameworks lead to promising improvements in ROUGE metrics and human ratings on both the CNN/Daily Mail and Newsroom summarization datasets.
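To make the embedding-level idea concrete, here is a minimal sketch of an IDF-weighted embedding reconstruction loss, assuming a simple setup: the decoder is asked to rebuild the IDF-weighted average of the source word embeddings from its own hidden states. All names (source_embeds, idf_weights, target_hidden, W_rec) are hypothetical, and the paper's exact architecture and distance function may differ.

```python
# Hedged sketch, not the authors' implementation: an IDF-weighted
# embedding-level reconstruction loss for abstractive summarization.
import torch
import torch.nn as nn

def embedding_reconstruction_loss(
    source_embeds: torch.Tensor,  # (src_len, emb_dim) source word embeddings
    idf_weights: torch.Tensor,    # (src_len,) IDF weight per source token
    target_hidden: torch.Tensor,  # (tgt_len, hid_dim) decoder hidden states
    W_rec: nn.Linear,             # hypothetical projection: hid_dim -> emb_dim
) -> torch.Tensor:
    # IDF-weighted average of the source embeddings: rarer words, which IDF
    # treats as more important, contribute more to the reconstruction target.
    weights = idf_weights / idf_weights.sum()                        # (src_len,)
    source_avg = (weights.unsqueeze(-1) * source_embeds).sum(dim=0)  # (emb_dim,)

    # Rebuild a single source vector from the mean target-side hidden state.
    reconstructed = W_rec(target_hidden.mean(dim=0))                 # (emb_dim,)

    # Penalize the distance between the two vectors (MSE is assumed here).
    return nn.functional.mse_loss(reconstructed, source_avg)

# Toy usage with random tensors, just to show the shapes involved.
src_len, tgt_len, emb_dim, hid_dim = 30, 10, 128, 256
loss = embedding_reconstruction_loss(
    torch.randn(src_len, emb_dim),
    torch.rand(src_len) + 1e-6,   # positive IDF-like weights
    torch.randn(tgt_len, hid_dim),
    nn.Linear(hid_dim, emb_dim),
)
```

In training, such a term would presumably be added to the standard cross-entropy objective with a tunable weight, so that the summary's hidden representation is pushed to retain the important (high-IDF) source content.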