No Arabic abstract
The enormous amount of discourse taking place online poses challenges to the functioning of a civil and informed public sphere. Efforts to standardize online discourse data, such as ClaimReview, are making available a wealth of new data about potentially inaccurate claims, reviewed by third-party fact-checkers. These data could help shed light on the nature of online discourse, the role of political elites in amplifying it, and its implications for the integrity of the online information ecosystem. Unfortunately, the semi-structured nature of much of this data presents significant challenges when it comes to modeling and reasoning about online discourse. A key challenge is relation extraction, which is the task of determining the semantic relationships between named entities in a claim. Here we develop a novel supervised learning method for relation extraction that combines graph embedding techniques with path traversal on semantic dependency graphs. Our approach is based on the intuitive observation that knowledge of the entities along the path between the subject and object of a triple (e.g. Washington,_D.C.}, and United_States_of_America) provides useful information that can be leveraged for extracting its semantic relation (i.e. capitalOf). As an example of a potential application of this technique for modeling online discourse, we show that our method can be integrated into a pipeline to reason about potential misinformation claims.
With the increasing popularity of social media, online interpersonal communication now plays an essential role in peoples everyday information exchange. Whether and how a newcomer can better engage in the community has attracted great interest due to its application in many scenarios. Although some prior works that explore early socialization have obtained salient achievements, they are focusing on sociological surveys based on the small group. To help individuals get through the early socialization period and engage well in online conversations, we study a novel task to foresee whether a newcomers message will be responded to by other participants in a multi-party conversation (henceforth textbf{Successful New-entry Prediction}). The task would be an important part of the research in online assistants and social media. To further investigate the key factors indicating such engagement success, we employ an unsupervised neural network, Variational Auto-Encoder (textbf{VAE}), to examine the topic content and discourse behavior from newcomers chatting history and conversations ongoing context. Furthermore, two large-scale datasets, from Reddit and Twitter, are collected to support further research on new-entries. Extensive experiments on both Twitter and Reddit datasets show that our model significantly outperforms all the baselines and popular neural models. Additional explainable and visual analyses on new-entry behavior shed light on how to better join in others discussions.
This paper is focused on the computational analysis of collective discourse, a collective behavior seen in non-expert content contributions in online social media. We collect and analyze a wide range of real-world collective discourse datasets from movie user reviews to microblogs and news headlines to scientific citations. We show that all these datasets exhibit diversity of perspective, a property seen in other collective systems and a criterion in wise crowds. Our experiments also confirm that the network of different perspective co-occurrences exhibits the small-world property with high clustering of different perspectives. Finally, we show that non-expert contributions in collective discourse can be used to answer simple questions that are otherwise hard to answer.
Current event-centric knowledge graphs highly rely on explicit connectives to mine relations between events. Unfortunately, due to the sparsity of connectives, these methods severely undermine the coverage of EventKGs. The lack of high-quality labelled corpora further exacerbates that problem. In this paper, we propose a knowledge projection paradigm for event relation extraction: projecting discourse knowledge to narratives by exploiting the commonalities between them. Specifically, we propose Multi-tier Knowledge Projection Network (MKPNet), which can leverage multi-tier discourse knowledge effectively for event relation extraction. In this way, the labelled data requirement is significantly reduced, and implicit event relations can be effectively extracted. Intrinsic experimental results show that MKPNet achieves the new state-of-the-art performance, and extrinsic experimental results verify the value of the extracted event relations.
Relation extraction aims to extract relational facts from sentences. Previous models mainly rely on manually labeled datasets, seed instances or human-crafted patterns, and distant supervision. However, the human annotation is expensive, while human-crafted patterns suffer from semantic drift and distant supervision samples are usually noisy. Domain adaptation methods enable leveraging labeled data from a different but related domain. However, different domains usually have various textual relation descriptions and different label space (the source label space is usually a superset of the target label space). To solve these problems, we propose a novel model of relation-gated adversarial learning for relation extraction, which extends the adversarial based domain adaptation. Experimental results have shown that the proposed approach outperforms previous domain adaptation methods regarding partial domain adaptation and can improve the accuracy of distance supervised relation extraction through fine-tuning.
Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some indirect links such as a third entity or via co-reference. Relation extraction in such scenarios becomes more challenging as we need to capture the long-distance interactions among the entities and other words in the sentence. Also, the words in a sentence do not contribute equally in identifying the relation between the two entities. To address this issue, we propose a novel and effective attention model which incorporates syntactic information of the sentence and a multi-factor attention mechanism. Experiments on the New York Times corpus show that our proposed model outperforms prior state-of-the-art models.