Do you want to publish a course? Click here

Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge

هل تناول القط القهوة؟المحولات الصعبة مع معرفة الحدث المعمم

438   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

Prior research has explored the ability of computational models to predict a word semantic fit with a given predicate. While much work has been devoted to modeling the typicality relation between verbs and arguments in isolation, in this paper we take a broader perspective by assessing whether and to what extent computational approaches have access to the information about the typicality of entire events and situations described in language (Generalized Event Knowledge). Given the recent success of Transformers Language Models (TLMs), we decided to test them on a benchmark for the dynamic estimation of thematic fit. The evaluation of these models was performed in comparison with SDM, a framework specifically designed to integrate events in sentence meaning representations, and we conducted a detailed error analysis to investigate which factors affect their behavior. Our results show that TLMs can reach performances that are comparable to those achieved by SDM. However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge, and their predictions often depend on surface linguistic features, such as frequent words, collocations and syntactic patterns, thereby showing sub-optimal generalization abilities.



References used
https://aclanthology.org/
rate research

Read More

Computational linguistic research on language change through distributional semantic (DS) models has inspired researchers from fields such as philosophy and literary studies, who use these methods for the exploration and comparison of comparatively s mall datasets traditionally analyzed by close reading. Research on methods for small data is still in early stages and it is not clear which methods achieve the best results. We investigate the possibilities and limitations of using distributional semantic models for analyzing philosophical data by means of a realistic use-case. We provide a ground truth for evaluation created by philosophy experts and a blueprint for using DS models in a sound methodological setup. We compare three methods for creating specialized models from small datasets. Though the models do not perform well enough to directly support philosophers yet, we find that models designed for small data yield promising directions for future work.
Understanding natural language requires common sense, one aspect of which is the ability to discern the plausibility of events. While distributional models---most recently pre-trained, Transformer language models---have demonstrated improvements in m odeling event plausibility, their performance still falls short of humans'. In this work, we show that Transformer-based plausibility models are markedly inconsistent across the conceptual classes of a lexical hierarchy, inferring that a person breathing'' is plausible while a dentist breathing'' is not, for example. We find this inconsistency persists even when models are softly injected with lexical knowledge, and we present a simple post-hoc method of forcing model consistency that improves correlation with human plausibility judgements.
We present a system for learning generalized, stereotypical patterns of events---or schemas''---from natural language stories, and applying them to make predictions about other stories. Our schemas are represented with Episodic Logic, a logical form that closely mirrors natural language. By beginning with a head start'' set of protoschemas--- schemas that a 1- or 2-year-old child would likely know---we can obtain useful, general world knowledge with very few story examples---often only one or two. Learned schemas can be combined into more complex, composite schemas, and used to make predictions in other stories where only partial information is available.
Large scale pretrained models have demonstrated strong performances on several natural language generation and understanding benchmarks. However, introducing commonsense into them to generate more realistic text remains a challenge. Inspired from pre vious work on commonsense knowledge generation and generative commonsense reasoning, we introduce two methods to add commonsense reasoning skills and knowledge into abstractive summarization models. Both methods beat the baseline on ROUGE scores, demonstrating the superiority of our models over the baseline. Human evaluation results suggest that summaries generated by our methods are more realistic and have fewer commonsensical errors.
The goal of Event Factuality Prediction (EFP) is to determine the factual degree of an event mention, representing how likely the event mention has happened in text. Current deep learning models has demonstrated the importance of syntactic and semant ic structures of the sentences to identify important context words for EFP. However, the major problem with these EFP models is that they only encode the one-hop paths between the words (i.e., the direct connections) to form the sentence structures. In this work, we show that the multi-hop paths between the words are also necessary to compute the sentence structures for EFP. To this end, we introduce a novel deep learning model for EFP that explicitly considers multi-hop paths with both syntax-based and semantic-based edges between the words to obtain sentence structures for representation learning in EFP. We demonstrate the effectiveness of the proposed model via the extensive experiments in this work.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا