Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models

Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings. In particular, it is not entirely clear what aspects of sentence-level syntax are captured by these representations, nor how (if at all) they are built along the stacked layers of the network. In this paper, we aim to address such questions with a general class of interventional, input perturbation-based analyses of representations from pretrained language models. Importing from computational and cognitive neuroscience the notion of representational invariance, we perform a series of probes designed to test the sensitivity of these representations to several kinds of structure in sentences. Each probe involves swapping words in a sentence and comparing the representations from perturbed sentences against the original. We experiment with three different perturbations: (1) random permutations of n-grams of varying width, to test the scale at which a representation is sensitive to word position; (2) swapping of two spans which do or do not form a syntactic phrase, to test sensitivity to global phrase structure; and (3) swapping of two adjacent words which do or do not break apart a syntactic phrase, to test sensitivity to local phrase structure. Results from these probes collectively suggest that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process. More broadly, our results also indicate that structured input perturbations widen the scope of analyses that can be performed on often-opaque deep learning systems, and can serve as a complement to existing tools (such as supervised linear probes) for interpreting complex black-box models.
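The three perturbations named in the abstract can be illustrated with a small Python sketch. This is not the authors' code: the function names, the way n-grams are chunked (consecutive, non-overlapping windows), the half-open span convention, and the use of cosine distance as the comparison measure are all assumptions made for illustration. The abstract does not specify these details.

```python
import math
import random
from typing import List, Sequence, Tuple


def shuffle_ngrams(tokens: Sequence[str], n: int, rng: random.Random) -> List[str]:
    """Perturbation (1): cut the sentence into consecutive chunks of width n
    (assumed chunking scheme) and randomly permute the chunks."""
    if n < 1:
        raise ValueError("n must be >= 1")
    chunks = [list(tokens[i:i + n]) for i in range(0, len(tokens), n)]
    rng.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]


def swap_spans(tokens: Sequence[str], a: Tuple[int, int], b: Tuple[int, int]) -> List[str]:
    """Perturbation (2): swap two non-overlapping half-open spans (start, end).
    Whether a span is a syntactic phrase must come from a parse, not shown here."""
    (a0, a1), (b0, b1) = sorted([a, b])
    if a1 > b0:
        raise ValueError("spans must not overlap")
    toks = list(tokens)
    return toks[:a0] + toks[b0:b1] + toks[a1:b0] + toks[a0:a1] + toks[b1:]


def swap_adjacent(tokens: Sequence[str], i: int) -> List[str]:
    """Perturbation (3): swap the words at positions i and i + 1."""
    out = list(tokens)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """One possible way to compare an original and a perturbed representation
    (assumed measure; the abstract only says representations are compared)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm


sentence = "the quick brown fox jumps over the lazy dog".split()
shuffled = shuffle_ngrams(sentence, 2, random.Random(0))
spans_swapped = swap_spans(sentence, (1, 3), (7, 8))
adjacent_swapped = swap_adjacent(sentence, 0)
```

In an actual experiment, each perturbed sentence would be fed to a pretrained model and its per-layer representations compared against those of the original sentence. That step depends on the model and library used and is left out of this sketch.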

References used

https://aclanthology.org/

Discourse Probing of Pretrained Language Models

Association for Computational Linguistics, 2021 (article)

Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse, but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.

Measuring and Improving Consistency in Pretrained Language Models

Association for Computational Linguistics, 2021 (article)

Consistency of a model, that is, the invariance of its behavior under meaning-preserving alternations in its input, is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel, we show that the consistency of all PLMs we experiment with is poor, though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.

Generating Datasets with Pretrained Language Models

Association for Computational Linguistics, 2021 (article)

To obtain high-quality sentence embeddings from pretrained language models (PLMs), they must either be augmented with additional pretraining objectives or finetuned on a large set of labeled text pairs. While the latter approach typically outperforms the former, it requires great human effort to generate suitable datasets of sufficient size. In this paper, we show how PLMs can be leveraged to obtain high-quality sentence embeddings without the need for labeled data, finetuning or modifications to the pretraining objective: We utilize the generative abilities of large and high-performing PLMs to generate entire datasets of labeled text pairs from scratch, which we then use for finetuning much smaller and more efficient models. Our fully unsupervised approach outperforms strong baselines on several semantic textual similarity datasets.

Unsupervised Paraphrasing with Pretrained Language Models

Association for Computational Linguistics, 2021 (article)

Paraphrase generation has benefited extensively from recent progress in the designing of training objectives and model architectures. However, previous explorations have largely focused on supervised methods, which require a large amount of labeled data that is costly to collect. To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (DB). To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step. We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair (QQP) and the ParaNMT datasets and is robust to domain shift between the two datasets of distinct distributions. We also demonstrate that our model transfers to paraphrasing in other languages without any additional finetuning.
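The blocking rule described for Dynamic Blocking can be sketched in a few lines of Python. This covers only the rule as stated in the abstract (after emitting a source token, forbid the token that follows it in the source for the next step). The function names, the toy score dictionary, and the fallback when every candidate is blocked are hypothetical. The published algorithm may include further details not reflected here.

```python
from typing import Dict, Optional, Sequence, Set


def blocked_next_tokens(source: Sequence[str], last_emitted: str) -> Set[str]:
    """Return every token that directly follows `last_emitted` in the source.
    These are the tokens to forbid at the next generation step."""
    return {
        source[i + 1]
        for i in range(len(source) - 1)
        if source[i] == last_emitted
    }


def pick_next(
    scores: Dict[str, float],
    source: Sequence[str],
    last_emitted: Optional[str],
) -> str:
    """Greedy choice over candidate scores with the blocking rule applied.
    `scores` stands in for a language model's next-token scores (assumption)."""
    blocked = blocked_next_tokens(source, last_emitted) if last_emitted else set()
    allowed = {tok: s for tok, s in scores.items() if tok not in blocked}
    if not allowed:
        # Hypothetical fallback: if everything is blocked, ignore the block.
        allowed = scores
    return max(allowed, key=allowed.get)


source = "how do i learn python".split()
candidates = {"python": 0.9, "coding": 0.5}
after_learn = pick_next(candidates, source, "learn")
after_other = pick_next(candidates, source, "hello")
```

Here `after_learn` avoids "python" because "python" follows "learn" in the source, while `after_other` is unconstrained because "hello" does not occur in the source.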

Low-resource Taxonomy Enrichment with Pretrained Language Models

Association for Computational Linguistics, 2021 (article)

Taxonomies are symbolic representations of hierarchical relationships between terms or entities. While taxonomies are useful in broad applications, manually updating or maintaining them is labor-intensive and difficult to scale in practice. Conventional supervised methods for this enrichment task fail to find optimal parents of new terms in low-resource settings where only small taxonomies are available because of overfitting to hierarchical relationships in the taxonomies. To tackle the problem of low-resource taxonomy enrichment, we propose Musubu, an efficient framework for taxonomy enrichment in low-resource settings with pretrained language models (LMs) as knowledge bases to compensate for the shortage of information. Musubu leverages an LM-based classifier to determine whether or not inputted term pairs have hierarchical relationships. Musubu also utilizes Hearst patterns to generate queries to leverage implicit knowledge from the LM efficiently for more accurate prediction. We empirically demonstrate the effectiveness of our method in extensive experiments on taxonomies from both a SemEval task and real-world retailer datasets.
