New community

Subscribe to the gold package and get unlimited access to Shamra Academy

EDTC: A Corpus for Discourse-Level Topic Chain Parsing

EDTC: Corpus لتخليص سلسلة موضع مستوى الخطاب

456 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Discourse analysis has long been known to be fundamental in natural language processing. In this research, we present our insight on discourse-level topic chain (DTC) parsing which aims at discovering new topics and investigating how these topics evolve over time within an article. To address the lack of data, we contribute a new discourse corpus with DTC-style dependency graphs annotated upon news articles. In particular, we ensure the high reliability of the corpus by utilizing a two-step annotation strategy to build the data and filtering out the annotations with low confidence scores. Based on the annotated corpus, we introduce a simple yet robust system for automatic discourse-level topic chain parsing.

References used

https://aclanthology.org/

rate research

DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues

517 - Association for Computation Linguistics 2021 مقالة

Recently, the Machine Translation (MT) community has become more interested in document-level evaluation especially in light of reactions to claims of human parity'', since examining the quality at the level of the document rather than at the sentenc e level allows for the assessment of suprasentential context, providing a more reliable evaluation. This paper presents a document-level corpus annotated in English with context-aware issues that arise when translating from English into Brazilian Portuguese, namely ellipsis, gender, lexical ambiguity, number, reference, and terminology, with six different domains. The corpus can be used as a challenge test set for evaluation and as a training/testing corpus for MT as well as for deep linguistic analysis of context issues. To the best of our knowledge, this is the first corpus of its kind.

document-level corpus annotated dela corpus corpus annotated وصف مستوى المستند المشروح ديلا كوربوس corpus المشروح صناعة حمض الفوسفور المزيد..

A Language Model-based Generative Classifier for Sentence-level Discourse Parsing

391 - Association for Computation Linguistics 2021 مقالة

Discourse segmentation and sentence-level discourse parsing play important roles for various NLP tasks to consider textual coherence. Despite recent achievements in both tasks, there is still room for improvement due to the scarcity of labeled data. To solve the problem, we propose a language model-based generative classifier (LMGC) for using more information from labels by treating the labels as an input while enhancing label representations by embedding descriptions for each label. Moreover, since this enables LMGC to make ready the representations for labels, unseen in the pre-training step, we can effectively use a pre-trained language model in LMGC. Experimental results on the RST-DT dataset show that our LMGC achieved the state-of-the-art F1 score of 96.72 in discourse segmentation. It further achieved the state-of-the-art relation F1 scores of 84.69 with gold EDU boundaries and 81.18 with automatically segmented boundaries, respectively, in sentence-level discourse parsing.

sentence-level discourse parsing model-based generative classifier language model-based generative تحليل خطاب مستوى الجملة مصنف المولد النموذجي لغة اللغة القائمة على نموذج صناعة حمض الفوسفور المزيد..

Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations

341 - Association for Computation Linguistics 2021 مقالة

Current language models are usually trained using a self-supervised scheme, where the main focus is learning representations at the word or sentence level. However, there has been limited progress in generating useful discourse-level representations. In this work, we propose to use ideas from predictive coding theory to augment BERT-style language models with a mechanism that allows them to learn suitable discourse-level representations. As a result, our proposed approach is able to predict future sentences using explicit top-down connections that operate at the intermediate layers of the network. By experimenting with benchmarks designed to evaluate discourse-related knowledge using pre-trained sentence representations, we demonstrate that our approach improves performance in 6 out of 11 tasks by excelling in discourse relationship detection.

augmenting bert-style models discourse-level representations predictive coding زيادة نماذج نمط بيرت تمثيلات مستوى الخطاب الترميز التنبئي صناعة حمض الفوسفور المزيد..

Neural-based RST Parsing And Analysis In Persuasive Discourse

266 - Association for Computation Linguistics 2021 مقالة

Most of the existing studies of language use in social media content have focused on the surface-level linguistic features (e.g., function words and punctuation marks) and the semantic level aspects (e.g., the topics, sentiment, and emotions) of the comments. The writer's strategies of constructing and connecting text segments have not been widely explored even though this knowledge is expected to shed light on how people reason in online environments. Contributing to this analysis direction for social media studies, we build an openly accessible neural RST parsing system that analyzes discourse relations in an online comment. Our experiments demonstrate that this system achieves comparable performance among all the neural RST parsing systems. To demonstrate the use of this tool in social media analysis, we apply it to identify the discourse relations in persuasive and non-persuasive comments and examine the relationships among the binary discourse tree depth, discourse relations, and the perceived persuasiveness of online comments. Our work demonstrates the potential of analyzing discourse structures of online comments with our system and the implications of these structures for understanding online communications.

neural-based rst parsing neural rst parsing rst parsing تحليل RST القائم على العصبي التحليل الأول العصبي التحليل الأول صناعة حمض الفوسفور المزيد..

Salience-Aware Event Chain Modeling for Narrative Understanding

412 - Association for Computation Linguistics 2021 مقالة

Storytelling, whether via fables, news reports, documentaries, or memoirs, can be thought of as the communication of interesting and related events that, taken together, form a concrete process. It is desirable to extract the event chains that repres ent such processes. However, this extraction remains a challenging problem. We posit that this is due to the nature of the texts from which chains are discovered. Natural language text interleaves a narrative of concrete, salient events with background information, contextualization, opinion, and other elements that are important for a variety of necessary discourse and pragmatics acts but are not part of the principal chain of events being communicated. We introduce methods for extracting this principal chain from natural language text, by filtering away non-salient events and supportive sentences. We demonstrate the effectiveness of our methods at isolating critical event chains by comparing their effect on downstream tasks. We show that by pre-training large language models on our extracted chains, we obtain improvements in two tasks that benefit from a clear understanding of event chains: narrative prediction and event-based temporal question answering. The demonstrated improvements and ablative studies confirm that our extraction method isolates critical event chains.

event chain modeling chain modeling salience-aware event chain نموذج سلسلة الحدث سلاسل الحدث العلمي المدرك صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

EDTC: A Corpus for Discourse-Level Topic Chain Parsing

EDTC: Corpus لتخليص سلسلة موضع مستوى الخطاب

Ask ChatGPT about the research

Read More

suggested questions