
When is Wall a Pared and when a Muro?: Extracting Rules Governing Lexical Selection


Publication date: 2021
Research language: English





Learning fine-grained distinctions between vocabulary items is a key challenge in learning a new language. For example, the noun "wall" has different lexical manifestations in Spanish: "pared" refers to an indoor wall, while "muro" refers to an outdoor wall. This kind of lexical distinction may not be obvious to non-native learners unless it is explicitly explained. In this work, we present a method for automatically identifying fine-grained lexical distinctions and extracting rules that explain these distinctions in a human- and machine-readable format. We confirm the quality of the extracted rules in a language-learning setup for two languages, Spanish and Greek, where we use the rules to teach non-native speakers when to translate a given ambiguous word into each of its possible translations.
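
The abstract only names the goal (rules that are both human- and machine-readable), so the sketch below is an illustration rather than the authors' pipeline: it trains a scikit-learn decision tree over bag-of-words context features and prints the tree as if-then rules. The tiny dataset, the feature set, and the choice of classifier are all assumptions made for the example.

```python
# Illustrative sketch: learn human-readable rules for choosing between two
# Spanish translations of "wall" from the surrounding words. The dataset,
# features, and classifier are assumptions, not the paper's actual method.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier, export_text

# Context sentences containing "wall", labeled with the correct translation.
contexts = [
    "we hung a painting on the wall of the living room",
    "the kitchen wall was painted white",
    "a stone wall surrounds the old garden",
    "they built a wall around the city",
]
labels = ["pared", "pared", "muro", "muro"]

# Bag-of-words features over the context.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(contexts)

# A shallow tree keeps the extracted rules short enough for learners to read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, labels)

# Print the learned rules, e.g. "if 'garden' appears, choose muro".
print(export_text(tree, feature_names=list(vectorizer.get_feature_names_out())))
```

A decision tree is used here precisely because its paths read off directly as if-then rules; any rule-induction method with the same property would serve.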



Related research

Models of language trained on very large corpora have been demonstrated useful for natural language processing. As fixed artifacts, they have become the object of intense study, with many researchers "probing" the extent to which they acquire and readily demonstrate linguistic abstractions, factual and commonsense knowledge, and reasoning abilities. Recent work applied several probes to intermediate training stages to observe the developmental process of a large-scale model (Chiang et al., 2020). Following this effort, we systematically answer a question: for the various types of knowledge a language model learns, when during (pre)training are they acquired? Using RoBERTa as a case study, we find: linguistic knowledge is acquired fast, stably, and robustly across domains. Facts and commonsense are slower and more domain-sensitive. Reasoning abilities are, in general, not stably acquired. As new datasets, pretraining protocols, and probes emerge, we believe that probing-across-time analyses can help researchers understand the complex, intermingled learning that these models undergo and guide us toward more efficient approaches that accomplish the necessary learning faster.
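
As an illustration of a single probing-across-time measurement, the sketch below runs a cloze-style factual probe over a sequence of pretraining checkpoints. The checkpoint paths are hypothetical placeholders; you would substitute your own saved intermediate states, since released models rarely expose them.

```python
# Illustrative sketch: run the same cloze probe at several pretraining
# checkpoints to see when a fact becomes recallable. The checkpoint paths
# below are hypothetical placeholders.
from transformers import pipeline

checkpoints = ["ckpts/roberta-step10k", "ckpts/roberta-step100k", "ckpts/roberta-step1M"]

for ckpt in checkpoints:
    probe = pipeline("fill-mask", model=ckpt)          # masked-LM head as a cloze probe
    preds = probe("The capital of France is <mask>.")  # RoBERTa-style mask token
    # Record the top prediction and its confidence at this training stage.
    print(ckpt, preds[0]["token_str"], round(preds[0]["score"], 3))
```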
Staircase visibility concerns the study of orthogonal polygons, and one of the most important subjects studied there is the specification of the kernel of an orthogonal starshaped set. Toranzos proved a very important result on specifying the kernel of a starshaped set under the usual notion of visibility via segments; Breen later presented an analogue of this result for staircase visibility. She also found a way to specify the kernel of a starshaped orthogonal polygon when the polygon is simply connected. The aim of this paper is to generalize that method to the case where the orthogonal polygon is doubly connected and the bounded component of its complement is rectangular. We prove the following result: for a closed, doubly connected orthogonal polygon that is a staircase starshaped set, if the boundary of the bounded component of the complement is a rectangle, then the kernel consists of either one, two, or four components.
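
Restated in conventional notation for readability (the symbol P is our own illustrative choice, not necessarily the paper's), the main result reads:

```latex
\textbf{Theorem.} Let $P \subset \mathbb{R}^2$ be a closed, doubly connected
orthogonal polygon that is staircase starshaped, and suppose the boundary of
the bounded component of $\mathbb{R}^2 \setminus P$ is a rectangle. Then the
staircase kernel of $P$ consists of exactly one, two, or four components.
```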
The paper describes the MilaNLP team's submission (Bocconi University, Milan) to the WASSA 2021 Shared Task on Empathy Detection and Emotion Classification. We focus on Track 2, Emotion Classification, which consists of predicting the emotion of reactions to English news stories at the essay level. We test different models based on multi-task and multi-input frameworks. The goal was to better exploit all the correlated information given in the data set. We find, though, that empathy as an auxiliary task in multi-task learning and demographic attributes as additional input both yield worse performance than single-task learning. While the result is competitive in terms of the competition, our results suggest that emotion and empathy are not related tasks, at least for the purpose of prediction.
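
The multi-task setup the abstract describes (a shared encoder with an emotion head plus an auxiliary empathy head) can be sketched as below. The encoder, dimensions, loss weighting, and data are illustrative assumptions, not the team's submitted architecture.

```python
# Illustrative multi-task sketch: shared encoder, emotion-classification head,
# auxiliary empathy-regression head. All sizes and weights are assumptions.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, vocab_size=30000, hidden=256, n_emotions=7):
        super().__init__()
        # Stand-in encoder; a real submission would use a pretrained transformer.
        self.encoder = nn.EmbeddingBag(vocab_size, hidden)
        self.emotion_head = nn.Linear(hidden, n_emotions)  # main task
        self.empathy_head = nn.Linear(hidden, 1)           # auxiliary task

    def forward(self, token_ids, offsets):
        h = self.encoder(token_ids, offsets)
        return self.emotion_head(h), self.empathy_head(h).squeeze(-1)

model = MultiTaskModel()
logits, empathy = model(torch.tensor([1, 5, 9, 2]), torch.tensor([0, 2]))
# Joint objective: emotion cross-entropy plus down-weighted empathy MSE.
loss = (nn.functional.cross_entropy(logits, torch.tensor([3, 0]))
        + 0.5 * nn.functional.mse_loss(empathy, torch.tensor([2.0, 4.5])))
```

A joint objective like this makes the paper's negative finding easy to test: drop the second loss term and compare against single-task training.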
Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP. Still, it remains unclear how this approach should be applied for unseen languages that are not covered by any available large-scale multilingual language model and for which only a small amount of raw data is generally available. In this work, by comparing multilingual and monolingual models, we show that such models behave in multiple ways on unseen languages. Some languages greatly benefit from transfer learning and behave similarly to closely related high-resource languages, whereas others apparently do not. Focusing on the latter, we show that this failure to transfer is largely related to the impact of the script used to write such languages. We show that transliterating those languages significantly improves the potential of large-scale multilingual language models on downstream tasks. This result provides a promising direction towards making these massively multilingual models useful for a new set of unseen languages.
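
The core intervention, moving text into a script the pretrained model has seen, can be sketched in a few lines. The `unidecode` package below is a generic stand-in for a language-appropriate transliteration scheme, and the model and example text are illustrative.

```python
# Illustrative sketch: transliterate text to Latin script before tokenizing
# with a multilingual model. `unidecode` stands in for a proper
# language-specific transliteration scheme.
from unidecode import unidecode
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

text = "Сорбонна"          # example text in a non-Latin script
latin = unidecode(text)    # -> "Sorbonna"

# Compare how the pretrained subword vocabulary segments each version.
print(tokenizer.tokenize(text))
print(tokenizer.tokenize(latin))
```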
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification. CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution using the predicted class distributions of high-confidence validation examples. CAN is easily applicable to any probabilistic classifier, with minimal computation overhead. We analyze the properties of CAN using simulated experiments, and empirically demonstrate its effectiveness across a diverse set of classification tasks.
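
The abstract does not spell out the procedure, so the sketch below is one reading of the name: stack an uncertain prediction together with high-confidence validation predictions, then alternate between normalizing columns toward a class prior and renormalizing rows into distributions, Sinkhorn-style. Treat it as a sketch of the idea, not the paper's reference implementation.

```python
# Illustrative sketch of alternating normalization: refine one uncertain
# prediction using high-confidence validation predictions. This is one
# reading of the idea, not the paper's reference implementation.
import numpy as np

def alternating_normalization(uncertain, confident, prior, steps=3):
    M = np.vstack([confident, uncertain])  # last row is the example to adjust
    for _ in range(steps):
        M = M / M.sum(axis=0, keepdims=True) * prior  # columns -> class prior
        M = M / M.sum(axis=1, keepdims=True)          # rows -> distributions
    return M[-1]

# Three high-confidence validation predictions over three classes.
confident = np.array([[0.90, 0.05, 0.05],
                      [0.05, 0.90, 0.05],
                      [0.10, 0.10, 0.80]])
prior = np.array([1.0, 1.0, 1.0])         # assumed uniform class prior
uncertain = np.array([0.40, 0.35, 0.25])  # ambiguous prediction to re-adjust

print(alternating_normalization(uncertain, confident, prior))
```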
