
ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle with tasks that require event temporal reasoning, which is essential for event-centric applications. We present a continual pre-training approach that equips PTLMs with targeted knowledge about event temporal relations. We design self-supervised learning objectives to recover masked-out event and temporal indicators and to discriminate sentences from their corrupted counterparts (where event or temporal indicators have been replaced). By further pre-training a PTLM with these objectives jointly, we reinforce its attention to event and temporal information, yielding enhanced capability on event temporal reasoning. This Effective CONtinual pre-training framework for Event Temporal reasoning (ECONET) improves the PTLMs' fine-tuning performance across five relation extraction and question answering tasks and achieves new or on-par state-of-the-art performance in most of our downstream tasks.
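The two self-supervised objectives described in the abstract can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the indicator list `TEMPORAL_INDICATORS`, the `[MASK]` string, and the helper names `mask_indicators` and `corrupt_indicators` are assumptions made for illustration, and only temporal indicators are handled (event masking would additionally need event annotations or a tagger, which is omitted here).

```python
import random

# Hypothetical, tiny indicator vocabulary (an assumption for this sketch;
# the lexicon actually used by ECONET is not given in the abstract).
TEMPORAL_INDICATORS = ["before", "after", "during", "while", "until", "since"]
MASK = "[MASK]"  # assumed mask symbol


def mask_indicators(tokens, indicators=TEMPORAL_INDICATORS, mask=MASK):
    """Mask every temporal indicator.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere, as targets for a recovery objective.
    """
    vocab = set(indicators)
    masked, labels = [], []
    for tok in tokens:
        if tok.lower() in vocab:
            masked.append(mask)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def corrupt_indicators(tokens, rng, indicators=TEMPORAL_INDICATORS):
    """Replace every temporal indicator with a different one.

    Returns (corrupted_tokens, replaced_flags); the flags mark which positions
    were changed, as targets for an original-vs-corrupted discriminator.
    """
    vocab = set(indicators)
    corrupted, replaced = [], []
    for tok in tokens:
        if tok.lower() in vocab:
            choices = [w for w in indicators if w != tok.lower()]
            corrupted.append(rng.choice(choices))
            replaced.append(True)
        else:
            corrupted.append(tok)
            replaced.append(False)
    return corrupted, replaced


sentence = "She locked the door before she left".split()
masked, labels = mask_indicators(sentence)
corrupted, replaced = corrupt_indicators(sentence, random.Random(0))
print(masked)
print(corrupted)
```

In a joint setup like the one the abstract describes, the masked sequence would feed a token-recovery loss and the corrupted sequence a discrimination loss; how the two losses are weighted and combined is not specified here.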

References used

https://aclanthology.org/


Towards a Language Model for Temporal Commonsense Reasoning

Association for Computational Linguistics, 2021 (article)

Temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text. In this work, we propose an ensemble model for temporal commonsense reasoning. Our model relies on pre-trained contextual representations from transformer-based language models (i.e., BERT), and on a variety of training methods for enhancing model generalization: 1) multi-step fine-tuning using carefully selected auxiliary tasks and datasets, and 2) a specifically designed temporal masked language model task aimed to capture temporal commonsense knowledge. Our model greatly outperforms the standard fine-tuning approach and strong baselines on the MC-TACO dataset.

temporal commonsense reasoning, commonsense reasoning, temporal commonsense

Extracting Event Temporal Relations via Hyperbolic Geometry

Association for Computational Linguistics, 2021 (article)

Detecting events and their evolution through time is a crucial task in natural language understanding. Recent neural approaches to event temporal relation extraction typically map events to embeddings in the Euclidean space and train a classifier to detect temporal relations between event pairs. However, embeddings in the Euclidean space cannot capture richer asymmetric relations such as event temporal relations. We thus propose to embed events into hyperbolic spaces, which are intrinsically oriented at modeling hierarchical structures. We introduce two approaches to encode events and their temporal relations in hyperbolic spaces. One approach leverages hyperbolic embeddings to directly infer event relations through simple geometrical operations. In the second one, we devise an end-to-end architecture composed of hyperbolic neural units tailored for the temporal relation extraction task. Thorough experimental assessments on widely used datasets have shown the benefits of revisiting the tasks on a different geometrical space, resulting in state-of-the-art performance on several standard metrics. Finally, the ablation study and several qualitative analyses highlighted the rich event semantics implicitly encoded into hyperbolic spaces.

hyperbolic geometry, temporal relations, extracting event temporal

Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp

Association for Computational Linguistics, 2021 (article)

Performance of neural models for named entity recognition degrades over time, becoming stale. This degradation is due to temporal drift, the change in our target variables' statistical properties over time. This issue is especially problematic for social media data, where topics change rapidly. In order to mitigate the problem, data annotation and retraining of models is common. Despite its usefulness, this process is expensive and time-consuming, which motivates new research on efficient model updating. In this paper, we propose an intuitive approach to measure the potential trendiness of tweets and use this metric to select the most informative instances to use for training. We conduct experiments on three state-of-the-art models on the Temporal Twitter Dataset. Our approach shows larger increases in prediction accuracy with less training data than the alternatives, making it an attractive, practical solution.

NER models crisp, mitigating temporal-drift, models crisp

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Association for Computational Linguistics, 2021 (article)

The problem of answering questions using knowledge from pre-trained language models (LMs) and knowledge graphs (KGs) presents two challenges: given a QA context (question and answer choice), methods need to (i) identify relevant knowledge from large KGs, and (ii) perform joint reasoning over the QA context and KG. Here we propose a new model, QA-GNN, which addresses the above challenges through two key innovations: (i) relevance scoring, where we use LMs to estimate the importance of KG nodes relative to the given QA context, and (ii) joint reasoning, where we connect the QA context and KG to form a joint graph, and mutually update their representations through graph-based message passing. We evaluate QA-GNN on the CommonsenseQA and OpenBookQA datasets, and show its improvement over existing LM and LM+KG models, as well as its capability to perform interpretable and structured reasoning, e.g., correctly handling negation in questions.

question answering, conversation

How much pretraining data do language models need to learn syntax?

Association for Computational Linguistics, 2021 (article)

Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks. However, while pretraining methods are very convenient, they are expensive in terms of time and resources. This calls for a study of the impact of pretraining data size on the knowledge of the models. We explore this impact on the syntactic capabilities of RoBERTa, using models trained on incremental sizes of raw text data. First, we use syntactic structural probes to determine whether models pretrained on more data encode a higher amount of syntactic information. Second, we perform a targeted syntactic evaluation to analyze the impact of pretraining data size on the syntactic generalization performance of the models. Third, we compare the performance of the different models on three downstream applications: part-of-speech tagging, dependency parsing and paraphrase identification. We complement our study with an analysis of the cost-benefit trade-off of training such models. Our experiments show that while models pretrained on more data encode more syntactic knowledge and perform better on downstream applications, they do not always offer a better performance across the different syntactic phenomena and come at a higher financial and environmental cost.

learn syntax, pretraining data size
