
Understanding Hard Negatives in Noise Contrastive Estimation


Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


The choice of negative examples is important in noise contrastive estimation. Recent works find that hard negatives (the highest-scoring incorrect examples under the model) are effective in practice, but they are used without a formal justification. We develop analytical tools to understand the role of hard negatives. Specifically, we view the contrastive loss as a biased estimator of the gradient of the cross-entropy loss, and show both theoretically and empirically that setting the negative distribution to be the model distribution results in bias reduction. We also derive a general form of the score function that unifies various architectures used in text retrieval. By combining hard negatives with appropriate score functions, we obtain strong results on the challenging task of zero-shot entity linking.
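As a rough illustration of the idea in this abstract, the sketch below compares a contrastive loss computed over randomly chosen negatives with one computed over hard negatives, using the full cross-entropy loss as the reference. The scores, the function names, and the top-k selection rule are assumptions made for this example and are not taken from the paper. The sketch also compares loss values rather than gradients, so it only conveys the intuition behind the bias argument.

```python
import math
import random

def contrastive_loss(scores, gold, negatives):
    """Negative log-softmax of the gold item over gold + chosen negatives."""
    candidates = [gold] + list(negatives)
    m = max(scores[i] for i in candidates)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(scores[i] - m) for i in candidates))
    return log_z - scores[gold]

def cross_entropy_loss(scores, gold):
    """Full cross-entropy: every incorrect item acts as a negative."""
    everything_else = [i for i in range(len(scores)) if i != gold]
    return contrastive_loss(scores, gold, everything_else)

def random_negatives(scores, gold, k, rng):
    """Uniformly sample k incorrect items."""
    pool = [i for i in range(len(scores)) if i != gold]
    return rng.sample(pool, k)

def hard_negatives(scores, gold, k):
    """Pick the k highest-scoring incorrect items under the current scores."""
    pool = [i for i in range(len(scores)) if i != gold]
    return sorted(pool, key=lambda i: scores[i], reverse=True)[:k]

# Hypothetical model scores for 8 candidates; index 0 is the gold item.
scores = [2.0, 1.9, 1.5, -1.0, -2.0, -3.0, -0.5, 0.1]
gold = 0

full = cross_entropy_loss(scores, gold)
hard = contrastive_loss(scores, gold, hard_negatives(scores, gold, 2))
rand = contrastive_loss(
    scores, gold, random_negatives(scores, gold, 2, random.Random(0))
)

print(f"full cross-entropy: {full:.4f}")
print(f"hard negatives:     {hard:.4f}")
print(f"random negatives:   {rand:.4f}")
```

With the same number of negatives, the hard-negative loss can never be further from the full cross-entropy than the random-negative loss, because the top-scoring negatives contribute the largest terms to the normalizer.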

References used

https://aclanthology.org/


Towards a Better Understanding of Noise in Natural Language Processing

Association for Computational Linguistics, 2021 (article)

In this paper, we propose a definition and taxonomy of various types of non-standard textual content, generally referred to as "noise", in Natural Language Processing (NLP). While data pre-processing is undoubtedly important in NLP, especially when dealing with user-generated content, a broader understanding of different sources of noise and how to deal with them is an aspect that has been largely neglected. We provide a comprehensive list of potential sources of noise, categorise and describe them, and show the impact of a subset of standard pre-processing strategies on different tasks. Our main goal is to raise awareness of non-standard content, which should not always be considered as "noise", and of the need for careful, task-dependent pre-processing. This is an alternative to blanket, all-encompassing solutions generally applied by researchers through "standard" pre-processing pipelines. The intention is for this categorisation to serve as a point of reference to support NLP researchers in devising strategies to clean, normalise or embrace non-standard content.

Not All Negatives are Equal: Label-Aware Contrastive Loss for Fine-grained Text Classification

Association for Computational Linguistics, 2021 (article)

Fine-grained classification involves dealing with datasets that have a larger number of classes with subtle differences between them. Guiding the model to focus on differentiating dimensions between these commonly confusable classes is key to improving performance on fine-grained tasks. In this work, we analyse the contrastive fine-tuning of pre-trained language models on two fine-grained text classification tasks, emotion classification and sentiment analysis. We adaptively embed class relationships into a contrastive objective function to weigh the positives and negatives differently, in particular weighting closely confusable negatives more than less similar negative examples. We find that Label-aware Contrastive Loss outperforms previous contrastive methods in the presence of a larger number of classes and/or more confusable classes, and helps models to produce output distributions that are more differentiated.

Noise Stability Regularization for Improving BERT Fine-tuning

Association for Computational Linguistics, 2021 (article)

Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which is investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theories about adding noise to the input and prove that our method gives a stabler regularization effect. We provide supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 20)
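The general idea of a noise stability penalty can be sketched as follows: perturb the input with Gaussian noise and penalize how much each layer's output moves. This is a minimal sketch under stated assumptions, not the LNSR method itself. The tiny tanh network, the function names, and the choice to inject noise at the raw input and to sum squared differences over all layers are all hypothetical simplifications.

```python
import math
import random

def forward(layers, x):
    """Run a tiny tanh MLP and return the output of every layer."""
    outputs = []
    h = x
    for weights in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h))) for row in weights]
        outputs.append(h)
    return outputs

def noise_stability_penalty(layers, x, sigma, rng):
    """Sum over layers of the squared L2 distance between clean and noisy outputs."""
    noisy_x = [v + rng.gauss(0.0, sigma) for v in x]
    clean_outputs = forward(layers, x)
    noisy_outputs = forward(layers, noisy_x)
    return sum(
        sum((a - b) ** 2 for a, b in zip(clean, noisy))
        for clean, noisy in zip(clean_outputs, noisy_outputs)
    )

# Hypothetical 2-layer network: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    [[0.5, -0.3], [0.1, 0.8]],
    [[1.0, -1.0]],
]
x = [1.0, 2.0]

penalty = noise_stability_penalty(layers, x, sigma=0.1, rng=random.Random(0))
print(f"noise stability penalty: {penalty:.6f}")
```

In a training loop, a term like this would typically be added to the task loss with a weighting coefficient, so that the model is encouraged to produce outputs that change little under small input perturbations.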


In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval

Association for Computational Linguistics, 2021 (article)

We present an efficient training approach to text retrieval with dense representations that applies knowledge distillation using the ColBERT late-interaction ranking model. Specifically, we propose to transfer the knowledge from a bi-encoder teacher to a student by distilling knowledge from ColBERT's expressive MaxSim operator into a simple dot product. The advantage of the bi-encoder teacher-student setup is that we can efficiently add in-batch negatives during knowledge distillation, enabling richer interactions between teacher and student models. In addition, using ColBERT as the teacher reduces training cost compared to a full cross-encoder. Experiments on the MS MARCO passage and document ranking tasks and data from the TREC 2019 Deep Learning Track demonstrate that our approach helps models learn robust representations for dense retrieval effectively and efficiently.
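To make the two scoring schemes in this abstract concrete, the sketch below contrasts a MaxSim-style late-interaction score over token vectors with a single-vector dot product, and builds an in-batch score matrix in which every other document in the batch serves as a negative. The vectors and function names are hypothetical, and the distillation step itself is not shown.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_tokens, doc_tokens):
    """Late interaction: for each query token take its best-matching
    document token, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

def in_batch_score_matrix(query_vecs, doc_vecs):
    """Single-vector scores for every query-document pair in a batch.
    Entry [i][i] is the positive pair; the rest of row i are in-batch negatives."""
    return [[dot(q, d) for d in doc_vecs] for q in query_vecs]

# Hypothetical token-level embeddings for one query and one document.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.5, 0.0], [0.0, 2.0], [1.0, 1.0]]
teacher_score = maxsim_score(query_tokens, doc_tokens)

# Hypothetical pooled embeddings for a batch of two query-document pairs.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
student_scores = in_batch_score_matrix(query_vecs, doc_vecs)

print(teacher_score)
print(student_scores)
```

Because both scoring functions operate on independently encoded queries and documents, the full batch-by-batch matrix can be computed without re-encoding any text, which is what makes in-batch negatives cheap in this kind of setup.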

Reconsidering Annotator Disagreement about Racist Language: Noise or Signal?

Association for Computational Linguistics, 2021 (article)

An abundance of methodological work aims to detect hateful and racist language in text. However, these tools are hampered by problems like low annotator agreement and remain largely disconnected from theoretical work on race and racism in the social sciences. Using annotations of 5188 tweets from 291 annotators, we investigate how annotator perceptions of racism in tweets vary by annotator racial identity and two text features of the tweets: relevant keywords and latent topics identified through structural topic modeling. We provide a descriptive summary of our data and estimate a series of generalized linear models to determine if annotator racial identity and our 12 latent topics, alone or in combination, explain the way racial sentiment was annotated, net of relevant annotator characteristics and tweet features. Our results show that White and non-White annotators exhibit significant differences in ratings when reading tweets with high prevalence of particular, racially-charged topics. We conclude by suggesting how future methodological work can draw on our results and further incorporate social science theory into analyses.
