Do you want to publish a course? Click here

MultiOpEd: A Corpus of Multi-Perspective News Editorials

متعددة المنصوص عليها: كائنات افتتاحية أخبار متعددة المنظور

279   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

We propose MultiOpEd, an open-domain news editorial corpus that supports various tasks pertaining to the argumentation structure in news editorials, focusing on automatic perspective discovery. News editorial is a genre of persuasive text, where the argumentation structure is usually implicit. However, the arguments presented in an editorial typically center around a concise, focused thesis, which we refer to as their perspective. MultiOpEd aims at supporting the study of multiple tasks relevant to automatic perspective discovery, where a system is expected to produce a single-sentence thesis statement summarizing the arguments presented. We argue that identifying and abstracting such natural language perspectives from editorials is a crucial step toward studying the implicit argumentation structure in news editorials. We first discuss the challenges and define a few conceptual tasks towards our goal. To demonstrate the utility of MultiOpEd and the induced tasks, we study the problem of perspective summarization in a multi-task learning setting, as a case study. We show that, with the induced tasks as auxiliary tasks, we can improve the quality of the perspective summary generated. We hope that MultiOpEd will be a useful resource for future studies on argumentation in the news editorial domain.



References used
https://aclanthology.org/
rate research

Read More

The way information is generated and disseminated has changed dramatically over the last decade. Identifying the political perspective shaping the way events are discussed in the media becomes more important due to the sharp increase in the number of news outlets and articles. Previous approaches usually only leverage linguistic information. However, news articles attempt to maintain credibility and seem impartial. Therefore, bias is introduced in subtle ways, usually by emphasizing different aspects of the story. In this paper, we propose a novel framework that considers entities mentioned in news articles and external knowledge about them, capturing the bias with respect to those entities. We explore different ways to inject entity information into the text model. Experiments show that our proposed framework achieves significant improvements over the standard text models, and is capable of identifying the difference in news narratives with different perspectives.
Information overload has been one of the challenges regarding information from the Internet. It is not a matter of information access, instead, the focus had shifted towards the quality of the retrieved data. Particularly in the news domain, multiple outlets report on the same news events but may differ in details. This work considers that different news outlets are more likely to differ in their writing styles and the choice of words, and proposes a method to extract sentences based on their key information by focusing on the shared synonyms in each sentence. Our method also attempts to reduce redundancy through hierarchical clustering and arrange selected sentences on the proposed orderBERT. The results show that the proposed unsupervised framework successfully improves the coverage, coherence, and, meanwhile, reduces the redundancy for a generated summary. Moreover, due to the process of obtaining the dataset, we also propose a data refinement method to alleviate the problems of undesirable texts, which result from the process of automatic scraping.
The casual, neutral, and formal language registers are highly perceptible in discourse productions. However, they are still poorly studied in Natural Language Processing (NLP), especially outside English, and for new textual types like tweets. To sti mulate research, this paper introduces a large corpus of 228,505 French tweets (6M words) annotated in language registers. Labels are provided by a multi-label CamemBERT classifier trained and checked on a manually annotated subset of the corpus, while the tweets are selected to avoid undesired biases. Based on the corpus, an initial analysis of linguistic traits from either human annotators or automatic extractions is provided to describe the corpus and pave the way for various NLP tasks. The corpus, annotation guide and classifier are available on http://tremolo.irisa.fr.
We propose Visual News Captioner, an entity-aware model for the task of news image captioning. We also introduce Visual News, a large-scale benchmark consisting of more than one million news images along with associated news articles, image captions, author information, and other metadata. Unlike the standard image captioning task, news images depict situations where people, locations, and events are of paramount importance. Our proposed method can effectively combine visual and textual features to generate captions with richer information such as events and entities. More specifically, built upon the Transformer architecture, our model is further equipped with novel multi-modal feature fusion techniques and attention mechanisms, which are designed to generate named entities more accurately. Our method utilizes much fewer parameters while achieving slightly better prediction results than competing methods. Our larger and more diverse Visual News dataset further highlights the remaining challenges in captioning news images.
Multi-hop relation detection in Knowledge Base Question Answering (KBQA) aims at retrieving the relation path starting from the topic entity to the answer node based on a given question, where the relation path may comprise multiple relations. Most o f the existing methods treat it as a single-label learning problem while ignoring the fact that for some complex questions, there exist multiple correct relation paths in knowledge bases. Therefore, in this paper, multi-hop relation detection is considered as a multi-label learning problem. However, performing multi-label multi-hop relation detection is challenging since the numbers of both the labels and the hops are unknown. To tackle this challenge, multi-label multi-hop relation detection is formulated as a sequence generation task. A relation-aware sequence relation generation model is proposed to solve the problem in an end-to-end manner. Experimental results show the effectiveness of the proposed method for relation detection and KBQA.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا