On the Limits of Minimal Pairs in Contrastive Evaluation

Published by Jannis Vamvas

Publication date: 2021

Research field: Informatics Engineering

Research language: English

Authors: Jannis Vamvas, Rico Sennrich

Computation and Language

Minimal sentence pairs are frequently used to analyze the behavior of language models. It is often assumed that model behavior on contrastive pairs is predictive of model behavior at large. We argue that two conditions are necessary for this assumption to hold: First, a tested hypothesis should be well-motivated, since experiments show that contrastive evaluation can lead to false positives. Secondly, test data should be chosen such as to minimize distributional discrepancy between evaluation time and deployment time. For a good approximation of deployment-time decoding, we recommend that minimal pairs are created based on machine-generated text, as opposed to human-written references. We present a contrastive evaluation suite for English-German MT that implements this recommendation.
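The scoring comparison behind contrastive evaluation can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation suite: `MinimalPair`, `contrastive_accuracy`, `toy_score`, and the made-up scores in `TOY_SCORES` are all hypothetical names and values introduced here, and a real setup would replace `toy_score` with the log-probability that an MT model assigns to a translation given its source.

```python
from typing import Callable, Iterable, NamedTuple


class MinimalPair(NamedTuple):
    """One contrastive test item: a source and two near-identical translations."""
    source: str
    correct: str
    contrastive: str  # differs from `correct` only by the tested error


def contrastive_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where the correct variant gets the strictly higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one minimal pair")
    hits = sum(
        score(p.source, p.correct) > score(p.source, p.contrastive)
        for p in pairs
    )
    return hits / len(pairs)


# Made-up scores standing in for model log-probabilities (assumption).
TOY_SCORES = {
    ("The cat sleeps.", "Die Katze schläft."): -1.2,
    ("The cat sleeps.", "Der Katze schläft."): -3.5,
    ("I see two dogs.", "Ich sehe zwei Hunde."): -2.0,
    ("I see two dogs.", "Ich sehe drei Hunde."): -1.8,  # toy model prefers the error
}


def toy_score(source: str, translation: str) -> float:
    """Hypothetical scorer: looks up a hard-coded score instead of running a model."""
    return TOY_SCORES[(source, translation)]


PAIRS = [
    MinimalPair("The cat sleeps.", "Die Katze schläft.", "Der Katze schläft."),
    MinimalPair("I see two dogs.", "Ich sehe zwei Hunde.", "Ich sehe drei Hunde."),
]

accuracy = contrastive_accuracy(PAIRS, toy_score)  # 1 of 2 pairs ranked correctly
```

In this sketch the correct variants are hand-written; the abstract's recommendation would correspond to building each pair from machine-generated translations instead, so that the scored text is closer to what the model produces at deployment time.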

Alex Warstadt, Alicia Parrish, Haokun Liu (2019)

We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLiMP), a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars, and aggregate human agreement with the labels is 96.4%. We use it to evaluate n-gram, LSTM, and Transformer (GPT-2 and Transformer-XL) LMs. We find that state-of-the-art models identify morphological contrasts reliably, but they struggle with semantic restrictions on the distribution of quantifiers and negative polarity items and subtle syntactic phenomena such as extraction islands.

Computation and Language

Limits on Spherical Coefficients in the Minimal-SME Photon Sector

W.J. Jessup, N.E. Russell (2016)

We place limits on spherical coefficients for Lorentz violation involving operators of dimension four in the photon sector of the minimal Standard-Model Extension. The bounds are deduced from existing experimental results with optical-cavity oscillators.

High Energy Physics - Phenomenology

On the Diversity and Limits of Human Explanations

Chenhao Tan (2021)

A growing effort in NLP aims to build datasets of human explanations. However, the term explanation encompasses a broad range of notions, each with different properties and ramifications. Our goal is to provide an overview of diverse types of explanations and human limitations, and discuss implications for collecting and using explanations in NLP. Inspired by prior work in psychology and cognitive sciences, we group existing human explanations in NLP into three categories: proximal mechanism, evidence, and procedure. These three types differ in nature and have implications for the resultant explanations. For instance, procedure is not considered explanations in psychology and connects with a rich body of work on learning from instructions. The diversity of explanations is further evidenced by proxy questions that are needed for annotators to interpret and answer open-ended why questions. Finally, explanations may require different, often deeper, understandings than predictions, which casts doubt on whether humans can provide useful explanations in some tasks.

Computation and Language, Artificial Intelligence, Computers and Society

Numerical Evaluation of the Bose-Ghost Propagator in Minimal Landau Gauge on the Lattice

Attilio Cucchieri, Tereza Mendes (2016)

We present numerical details of the evaluation of the so-called Bose-ghost propagator in lattice minimal Landau gauge, for the SU(2) case in four Euclidean dimensions. This quantity has been proposed as a carrier of the confining force in the Gribov-Zwanziger approach and, as such, its infrared behavior could be relevant for the understanding of color confinement in Yang-Mills theories. Also, its nonzero value can be interpreted as direct evidence of BRST-symmetry breaking, which is induced when restricting the functional measure to the first Gribov region Omega. Our simulations are done for lattice volumes up to 120^4 and for physical lattice extents up to 13.5 fm. We investigate the infinite-volume and continuum limits.

High Energy Physics - Lattice

On the Evaluation of Machine Translation for Terminology Consistency

Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier (2021)

As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios and particularly in cases of domain adaptation, one expects the MT output to adhere to the constraints provided by a terminology. In this work, we propose metrics to measure the consistency of MT output with regards to a domain terminology. We perform studies on the COVID-19 domain over 5 languages, also performing terminology-targeted human evaluation. We open-source the code for computing all proposed metrics: https://github.com/mahfuzibnalam/terminology_evaluation

Computation and Language
