ترغب بنشر مسار تعليمي؟ اضغط هنا

DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases

107   0   0.0 ( 0 )
 نشر من قبل Zhiqing Sun
 تاريخ النشر 2019
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Keyphrase extraction from documents is useful to a variety of applications such as information retrieval and document summarization. This paper presents an end-to-end method called DivGraphPointer for extracting a set of diversified keyphrases from a document. DivGraphPointer combines the advantages of traditional graph-based ranking methods and recent neural network-based approaches. Specifically, given a document, a word graph is constructed from the document based on word proximity and is encoded with graph convolutional networks, which effectively capture document-level word salience by modeling long-range dependency between words in the document and aggregating multiple appearances of identical words into one node. Furthermore, we propose a diversified point network to generate a set of diverse keyphrases out of the word graph in the decoding process. Experimental results on five benchmark data sets show that our proposed method significantly outperforms the existing state-of-the-art approaches.



قيم البحث

اقرأ أيضاً

Recently, the sequence-to-sequence models have made remarkable progress on the task of keyphrase generation (KG) by concatenating multiple keyphrases in a predefined order as a target sequence during training. However, the keyphrases are inherently a n unordered set rather than an ordered sequence. Imposing a predefined order will introduce wrong bias during training, which can highly penalize shifts in the order between keyphrases. In this work, we propose a new training paradigm One2Set without predefining an order to concatenate the keyphrases. To fit this paradigm, we propose a novel model that utilizes a fixed set of learned control codes as conditions to generate a set of keyphrases in parallel. To solve the problem that there is no correspondence between each prediction and target during training, we propose a $K$-step target assignment mechanism via bipartite matching, which greatly increases the diversity and reduces the duplication ratio of generated keyphrases. The experimental results on multiple benchmarks demonstrate that our approach significantly outperforms the state-of-the-art methods.
Relation extraction is a type of information extraction task that recognizes semantic relationships between entities in a sentence. Many previous studies have focused on extracting only one semantic relation between two entities in a single sentence. However, multiple entities in a sentence are associated through various relations. To address this issue, we propose a relation extraction model based on a dual pointer network with a multi-head attention mechanism. The proposed model finds n-to-1 subject-object relations using a forward object decoder. Then, it finds 1-to-n subject-object relations using a backward subject decoder. Our experiments confirmed that the proposed model outperformed previous models, with an F1-score of 80.8% for the ACE-2005 corpus and an F1-score of 78.3% for the NYT corpus.
92 - Junheng Huang , Lu Pan , Kang Xu 2020
Comment generation, a new and challenging task in Natural Language Generation (NLG), attracts a lot of attention in recent years. However, comments generated by previous work tend to lack pertinence and diversity. In this paper, we propose a novel ge neration model based on Topic-aware Pointer-Generator Networks (TPGN), which can utilize the topic information hidden in the articles to guide the generation of pertinent and diversified comments. Firstly, we design a keyword-level and topic-level encoder attention mechanism to capture topic information in the articles. Next, we integrate the topic information into pointer-generator networks to guide comment generation. Experiments on a large scale of comment generation dataset show that our model produces the valuable comments and outperforms competitive baseline models significantly.
The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge. We pursue the construction of a knowledge base (KB) of mechanisms -- a f undamental concept across the sciences encompassing activities, functions and causal relations, ranging from cellular processes to economic impacts. We extract this information from the natural language of scientific papers by developing a broad, unified schema that strikes a balance between relevance and breadth. We annotate a dataset of mechanisms with our schema and train a model to extract mechanism relations from papers. Our experiments demonstrate the utility of our KB in supporting interdisciplinary scientific search over COVID-19 literature, outperforming the prominent PubMed search in a study with clinical experts.
Sentence ordering is one of important tasks in NLP. Previous works mainly focused on improving its performance by using pair-wise strategy. However, it is nontrivial for pair-wise models to incorporate the contextual sentence information. In addition , error prorogation could be introduced by using the pipeline strategy in pair-wise models. In this paper, we propose an end-to-end neural approach to address the sentence ordering problem, which uses the pointer network (Ptr-Net) to alleviate the error propagation problem and utilize the whole contextual information. Experimental results show the effectiveness of the proposed model. Source codes and dataset of this paper are available.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا