Do you want to publish a course? Click here

Graph Convolutional Networks for Event Causality Identification with Rich Document-level Structures

تشخيص شبكات تشفيرية لتحديد السببية الحدث مع هياكل مستوى الوثائق الغنية

313   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

We study the problem of Event Causality Identification (ECI) to detect causal relation between event mention pairs in text. Although deep learning models have recently shown state-of-the-art performance for ECI, they are limited to the intra-sentence setting where event mention pairs are presented in the same sentences. This work addresses this issue by developing a novel deep learning model for document-level ECI (DECI) to accept inter-sentence event mention pairs. As such, we propose a graph-based model that constructs interaction graphs to capture relevant connections between important objects for DECI in input documents. Such interaction graphs are then consumed by graph convolutional networks to learn document context-augmented representations for causality prediction between events. Various information sources are introduced to enrich the interaction graphs for DECI, featuring discourse, syntax, and semantic information. Our extensive experiments show that the proposed model achieves state-of-the-art performance on two benchmark datasets.



References used
https://aclanthology.org/
rate research

Read More

The goal of Event Factuality Prediction (EFP) is to determine the factual degree of an event mention, representing how likely the event mention has happened in text. Current deep learning models has demonstrated the importance of syntactic and semant ic structures of the sentences to identify important context words for EFP. However, the major problem with these EFP models is that they only encode the one-hop paths between the words (i.e., the direct connections) to form the sentence structures. In this work, we show that the multi-hop paths between the words are also necessary to compute the sentence structures for EFP. To this end, we introduce a novel deep learning model for EFP that explicitly considers multi-hop paths with both syntax-based and semantic-based edges between the words to obtain sentence structures for representation learning in EFP. We demonstrate the effectiveness of the proposed model via the extensive experiments in this work.
Event detection (ED) task aims to classify events by identifying key event trigger words embedded in a piece of text. Previous research have proved the validity of fusing syntactic dependency relations into Graph Convolutional Networks(GCN). While ex isting GCN-based methods explore latent node-to-node dependency relations according to a stationary adjacency tensor, an attention-based dynamic tensor, which can pay much attention to the key node like event trigger or its neighboring nodes, has not been developed. Simultaneously, suffering from the phenomenon of graph information vanishing caused by the symmetric adjacency tensor, existing GCN models can not achieve higher overall performance. In this paper, we propose a novel model Self-Attention Graph Residual Convolution Networks (SA-GRCN) to mine node-to-node latent dependency relations via self-attention mechanism and introduce Graph Residual Network (GResNet) to solve graph information vanishing problem. Specifically, a self-attention module is constructed to generate an attention tensor, representing the dependency attention scores of all words in the sentence. Furthermore, a graph residual term is added to the baseline SA-GCN to construct a GResNet. Considering the syntactically connection of the network input, we initialize the raw adjacency tensor without processed by the self-attention module as the residual term. We conduct experiments on the ACE2005 dataset and the results show significant improvement over competitive baseline methods.
Existing works on information extraction (IE) have mainly solved the four main tasks separately (entity mention recognition, relation extraction, event trigger detection, and argument extraction), thus failing to benefit from inter-dependencies betwe en tasks. This paper presents a novel deep learning model to simultaneously solve the four tasks of IE in a single model (called FourIE). Compared to few prior work on jointly performing four IE tasks, FourIE features two novel contributions to capture inter-dependencies between tasks. First, at the representation level, we introduce an interaction graph between instances of the four tasks that is used to enrich the prediction representation for one instance with those from related instances of other tasks. Second, at the label level, we propose a dependency graph for the information types in the four IE tasks that captures the connections between the types expressed in an input sentence. A new regularization mechanism is introduced to enforce the consistency between the golden and predicted type dependency graphs to improve representation learning. We show that the proposed model achieves the state-of-the-art performance for joint IE on both monolingual and multilingual learning settings with three different languages.
Recent works show that the graph structure of sentences, generated from dependency parsers, has potential for improving event detection. However, they often only leverage the edges (dependencies) between words, and discard the dependency labels (e.g. , nominal-subject), treating the underlying graph edges as homogeneous. In this work, we propose a novel framework for incorporating both dependencies and their labels using a recently proposed technique called Graph Transformer Network (GTN). We integrate GTN to leverage dependency relations on two existing homogeneous-graph-based models and demonstrate an improvement in the F1 score on the ACE dataset.
Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representation of text. However, due to the quadratic self-attention complexity, most of the pretrained Transformers models can only handle relatively short text. It is still a challenge when it comes to modeling very long documents. In this work, we propose to use a graph attention network on top of the available pretrained Transformers model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large amount of unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches in document classification and document retrieval tasks.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا