Do you want to publish a course? Click here

Multiplex Graph Neural Network for Extractive Text Summarization

129   0   0.0 ( 0 )
 Added by Baoyu Jing
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

Extractive text summarization aims at extracting the most representative sentences from a given document as its summary. To extract a good summary from a long text document, sentence embedding plays an important role. Recent studies have leveraged graph neural networks to capture the inter-sentential relationship (e.g., the discourse graph) to learn contextual sentence embedding. However, those approaches neither consider multiple types of inter-sentential relationships (e.g., semantic similarity & natural connection), nor model intra-sentential relationships (e.g, semantic & syntactic relationship among words). To address these problems, we propose a novel Multiplex Graph Convolutional Network (Multi-GCN) to jointly model different types of relationships among sentences and words. Based on Multi-GCN, we propose a Multiplex Graph Summarization (Multi-GraS) model for extractive text summarization. Finally, we evaluate the proposed models on the CNN/DailyMail benchmark dataset to demonstrate the effectiveness of our method.

rate research

Read More

Although domain shift has been well explored in many NLP applications, it still has received little attention in the domain of extractive text summarization. As a result, the model is under-utilizing the nature of the training data due to ignoring the difference in the distribution of training sets and shows poor generalization on the unseen domain. With the above limitation in mind, in this paper, we first extend the conventional definition of the domain from categories into data sources for the text summarization task. Then we re-purpose a multi-domain summarization dataset and verify how the gap between different domains influences the performance of neural summarization models. Furthermore, we investigate four learning strategies and examine their abilities to deal with the domain shift problem. Experimental results on three different settings show their different characteristics in our new testbed. Our source code including textit{BERT-based}, textit{meta-learning} methods for multi-domain summarization learning and the re-purposed dataset textsc{Multi-SUM} will be available on our project: url{http://pfliu.com/TransferSum/}.
Recently, researches have explored the graph neural network (GNN) techniques on text classification, since GNN does well in handling complex structures and preserving global information. However, previous methods based on GNN are mainly faced with the practical problems of fixed corpus level graph structure which do not support online testing and high memory consumption. To tackle the problems, we propose a new GNN based model that builds graphs for each input text with global parameters sharing instead of a single graph for the whole corpus. This method removes the burden of dependence between an individual text and entire corpus which support online testing, but still preserve global information. Besides, we build graphs by much smaller windows in the text, which not only extract more local features but also significantly reduce the edge numbers as well as memory consumption. Experiments show that our model outperforms existing models on several text classification datasets even with consuming less memory.
Most prior work in the sequence-to-sequence paradigm focused on datasets with input sequence lengths in the hundreds of tokens due to the computational constraints of common RNN and Transformer architectures. In this paper, we study long-form abstractive text summarization, a sequence-to-sequence setting with input sequence lengths up to 100,000 tokens and output sequence lengths up to 768 tokens. We propose SEAL, a Transformer-based model, featuring a new encoder-decoder attention that dynamically extracts/selects input snippets to sparsely attend to for each output segment. Using only the original documents and summaries, we derive proxy labels that provide weak supervision for extractive layers simultaneously with regular supervision from abstractive summaries. The SEAL model achieves state-of-the-art results on existing long-form summarization tasks, and outperforms strong baseline models on a new dataset/task we introduce, Search2Wiki, with much longer input text. Since content selection is explicit in the SEAL model, a desirable side effect is that the selection can be inspected for enhanced interpretability.
This article briefly explains our submitted approach to the DocEng19 competition on extractive summarization. We implemented a recurrent neural network based model that learns to classify whether an articles sentence belongs to the corresponding extractive summary or not. We bypass the lack of large annotated news corpora for extractive summarization by generating extractive summaries from abstractive ones, which are available from the CNN corpus.
The recent years have seen remarkable success in the use of deep neural networks on text summarization. However, there is no clear understanding of textit{why} they perform so well, or textit{how} they might be improved. In this paper, we seek to better understand how neural extractive summarization systems could benefit from different types of model architectures, transferable knowledge and learning schemas. Additionally, we find an effective way to improve current frameworks and achieve the state-of-the-art result on CNN/DailyMail by a large margin based on our observations and analyses. Hopefully, our work could provide more clues for future research on extractive summarization.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا