
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: limitations of acquiring; provable limitations; future language models



Abstract

Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever "understand" raw text without access to some form of grounding. We formally investigate the abilities of ungrounded systems to acquire meaning. Our analysis focuses on the role of "assertions": textual contexts that provide indirect clues about the underlying semantics. We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence. We find that assertions enable semantic emulation of languages that satisfy a strong notion of semantic transparency. However, for classes of languages where the same expression can take different values in different contexts, we show that emulation can become uncomputable. Finally, we discuss differences between our formal model and natural language, exploring how our results generalize to a modal setting and other semantic relations. Together, our results suggest that assertions in code or language do not provide sufficient signal to fully emulate semantic representations. We formalize ways in which ungrounded language models appear to be fundamentally limited in their ability to "understand".

References used

https://aclanthology.org/


Distilling Word Meaning in Context from Pre-trained Language Models

Association for Computational Linguistics, 2021 (article)

In this study, we propose a self-supervised learning method that distils representations of word meaning in context from a pre-trained masked language model. Word representations are the basis for context-aware lexical semantics and unsupervised semantic textual similarity (STS) estimation. A previous study transforms contextualised representations employing static word embeddings to weaken excessive effects of contextual information. In contrast, the proposed method derives representations of word meaning in context while preserving useful context information intact. Specifically, our method learns to combine outputs of different hidden layers using self-attention through self-supervised learning with an automatically generated training corpus. To evaluate the performance of the proposed approach, we performed comparative experiments using a range of benchmark tasks. The results confirm that our representations exhibited a competitive performance compared to that of the state-of-the-art method transforming contextualised representations for the context-aware lexical semantic tasks and outperformed it for STS estimation.

Keywords: distilling word meaning; masked language model

Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?

Association for Computational Linguistics, 2021 (article)

In this paper, we investigate what types of stereotypical information are captured by pretrained language models. We present the first dataset comprising stereotypical attributes of a range of social groups and propose a method to elicit stereotypes encoded by pretrained language models in an unsupervised fashion. Moreover, we link the emergent stereotypes to their manifestation as basic emotions as a means to study their emotional effects in a more generalized manner. To demonstrate how our methods can be used to analyze emotion and stereotype shifts due to linguistic experience, we use fine-tuning on news sources as a case study. Our experiments expose how attitudes towards different social groups vary across models and how quickly emotions and stereotypes can shift at the fine-tuning stage.

Keywords: language models learn

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories

Association for Computational Linguistics, 2021 (article)

Measuring event salience is essential in the understanding of stories. This paper takes a recent unsupervised method for salience detection derived from Barthes Cardinal Functions and theories of surprise and applies it to longer narrative forms. We improve the standard transformer language model by incorporating an external knowledgebase (derived from Retrieval Augmented Generation) and adding a memory mechanism to enhance performance on longer works. We use a novel approach to derive salience annotation using chapter-aligned summaries from the Shmoop corpus for classic literary works. Our evaluation against this data demonstrates that our salience detection model improves performance over and above a non-knowledgebase and memory augmented language model, both of which are crucial to this improvement.

Keywords: knowledge augmented language; knowledge augmented; inferring salience

What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

Association for Computational Linguistics, 2021 (article)

GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens. Enhanced by our Korean-specific tokenization, HyperCLOVA with our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performances on various downstream tasks in Korean. Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline. Then we discuss the possibility of materializing the No Code AI paradigm by providing AI prototyping capabilities to non-experts of ML by introducing HyperCLOVA studio, an interactive prompt engineering interface. Lastly, we demonstrate the potential of our methods with three successful in-house applications.

Keywords: language models bring; generative pretrained transformers

Language Models are Few-shot Multilingual Learners

Association for Computational Linguistics, 2021 (article)

General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art cross-lingual models and translation models.

Keywords: few-shot multilingual learners; multilingual learners
