The choice of negative examples is important in noise contrastive estimation. Recent works find that hard negatives---highest-scoring incorrect examples under the model---are effective in practice, but they are used without a formal justification. We develop analytical tools to understand the role of hard negatives. Specifically, we view the contrastive loss as a biased estimator of the gradient of the cross-entropy loss, and show both theoretically and empirically that setting the negative distribution to be the model distribution results in bias reduction. We also derive a general form of the score function that unifies various architectures used in text retrieval. By combining hard negatives with appropriate score functions, we obtain strong results on the challenging task of zero-shot entity linking.
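The hard-negative selection the abstract describes can be sketched minimally: score all candidates under the current model, keep the k highest-scoring incorrect ones, and compute a softmax cross-entropy over the positive plus those negatives. The function name and interface below are illustrative assumptions, not the paper's implementation.

```python
import math

def hard_negative_nce_loss(scores, positive_idx, k):
    """Contrastive loss over the positive example and the k
    highest-scoring incorrect candidates (hard negatives)
    under the current model scores."""
    # Hard negatives: top-k scores among incorrect candidates.
    negatives = sorted(
        (s for i, s in enumerate(scores) if i != positive_idx),
        reverse=True,
    )[:k]
    # Softmax cross-entropy with the positive at index 0.
    logits = [scores[positive_idx]] + negatives
    log_z = math.log(sum(math.exp(l) for l in logits))
    return log_z - logits[0]
```

As the abstract argues, sampling negatives from the model distribution (here approximated by taking the model's top-scoring incorrect candidates) reduces the bias of this loss as an estimator of the full cross-entropy gradient.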