
Noise Stability Regularization for Improving BERT Fine-tuning


Publication date: 2021
Language: English





Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which is investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theories about adding noise to the input and prove that our method gives a stabler regularization effect. We provide supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly better generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 2020).
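
The abstract describes LNSR only at a high level. The sketch below is a minimal illustration of the general idea, not the authors' implementation: a toy encoder stands in for BERT's layer stack, Gaussian noise is injected into the hidden states entering one encoder layer, and the squared distance between the clean and perturbed outputs of every layer above it is added to the task loss. The toy encoder, the noise scale `sigma`, and the weight `lambda_reg` are illustrative assumptions.

```python
# Minimal sketch of layer-wise noise stability regularization (LNSR).
# Illustration only: noise is injected into the hidden states entering
# layer `noise_layer`, and the squared distance between the clean and
# perturbed outputs of every layer above it is penalized.
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Stand-in for a pre-trained Transformer encoder (e.g. BERT)."""

    def __init__(self, d_model=64, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )

    def forward_from(self, hidden, start=0):
        """Run layers `start`..end and return every layer's output."""
        outputs = []
        for layer in self.layers[start:]:
            hidden = layer(hidden)
            outputs.append(hidden)
        return outputs


def lnsr_loss(encoder, embeddings, noise_layer=1, sigma=0.01):
    """Layer-wise squared distance between clean and noise-perturbed outputs.

    `noise_layer` must be >= 1 in this sketch.
    """
    clean = encoder.forward_from(embeddings)         # outputs of layers 0..L-1
    base = clean[noise_layer - 1]                    # input to the noised layer
    noisy = encoder.forward_from(base + sigma * torch.randn_like(base),
                                 start=noise_layer)
    return sum((c - n).pow(2).mean() for c, n in zip(clean[noise_layer:], noisy))


# Usage: add the regularizer (weighted by lambda_reg) to the usual task loss.
encoder = TinyEncoder()
embeddings = torch.randn(8, 16, 64)        # (batch, seq_len, d_model)
task_loss = torch.tensor(0.0)              # placeholder for e.g. cross-entropy
lambda_reg = 1.0                           # regularization weight (illustrative)
loss = task_loss + lambda_reg * lnsr_loss(encoder, embeddings)
loss.backward()
```

In practice the clean forward pass would be shared with the task loss rather than recomputed; the sketch only shows where the noise enters and which differences get penalized.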



References used
https://aclanthology.org/
Related research


In this paper, we describe our system used for SemEval 2021 Task 7: HaHackathon: Detecting and Rating Humor and Offense. We used a simple fine-tuning approach using different Pre-trained Language Models (PLMs) to evaluate their performance for humor and offense detection. For regression tasks, we averaged the scores of different models leading to better performance than the original models. We participated in all SubTasks. Our best performing system was ranked 4 in SubTask 1-b, 8 in SubTask 1-c, 12 in SubTask 2, and performed well in SubTask 1-a. We further show comprehensive results using different pre-trained language models which will help as baselines for future work.
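
The system above averages per-example regression scores from several fine-tuned PLMs. A minimal sketch of that averaging step, assuming each model has already written one score per line to a prediction file (the file names are illustrative):

```python
# Average per-example regression scores from several fine-tuned models.
# File names are illustrative; each file holds one score per line, with
# the same example ordering in every file.
import numpy as np

prediction_files = ["bert_scores.txt", "roberta_scores.txt", "ernie_scores.txt"]
scores = np.stack([np.loadtxt(path) for path in prediction_files])
ensemble = scores.mean(axis=0)             # element-wise average across models
np.savetxt("ensemble_scores.txt", ensemble)
```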
We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the model posterior difference between clean and noisy inputs. Theoretically, we provide a connection between two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. Additionally, we generalize the posterior differential regularization to the family of f-divergences and characterize the overall framework in terms of the Jacobian matrix. Empirically, we compare those regularizations and standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model generalization. For both fully supervised and semi-supervised settings, we show that regularizing the posterior difference with f-divergence can result in well-improved model robustness. In particular, with a proper f-divergence, a BERT-base model can achieve comparable generalization as its BERT-large counterpart for in-domain, adversarial and domain shift scenarios, indicating the great potential of the proposed framework for enhancing NLP model robustness.
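
The regularizer described above penalizes the divergence between the model's posteriors on clean and perturbed inputs. The sketch below uses KL divergence, one member of the f-divergence family the paper studies, with simple Gaussian noise on the input features; the toy classifier, `sigma`, and the weight `alpha` are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of posterior-differential regularization: penalize the
# divergence between the class posteriors on clean and noisy inputs.
# KL divergence is used here as one member of the f-divergence family.
import torch
import torch.nn.functional as F


def posterior_diff_loss(model, features, sigma=0.01):
    """KL(p_clean || p_noisy) between posteriors on clean and noisy inputs."""
    clean_logits = model(features)
    noisy_logits = model(features + sigma * torch.randn_like(features))
    return F.kl_div(
        F.log_softmax(noisy_logits, dim=-1),   # log q (noisy posterior)
        F.softmax(clean_logits, dim=-1),       # p (clean posterior)
        reduction="batchmean",
    )


# Usage with a toy classification head standing in for BERT plus task head.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3)
)
features = torch.randn(8, 64)
labels = torch.randint(0, 3, (8,))
alpha = 1.0                                    # regularization weight (illustrative)
loss = F.cross_entropy(model(features), labels) \
       + alpha * posterior_diff_loss(model, features)
loss.backward()
```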
Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant relation-specific information in the sentence. As such, distantly supervised training data contains much noise which adversely affects the performance of the models. In this paper, we propose a self-ensemble filtering mechanism to filter out the noisy samples during the training process. We evaluate our proposed framework on the New York Times dataset which is obtained via distant supervision. Our experiments with multiple state-of-the-art neural relation extraction models show that our proposed filtering mechanism improves the robustness of the models and increases their F1 scores.
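
The filtering idea above can be sketched as follows, under the assumption (an illustrative one, not necessarily the paper's exact procedure) that the self-ensemble is an exponential moving average of the model's per-sample class probabilities across epochs, and that a sample is dropped from the next epoch when the ensembled prediction disagrees with its distant-supervision label.

```python
# Sketch of a self-ensemble noise filter for distantly supervised data.
# Assumption (illustrative): the ensemble is an exponential moving average
# (EMA) of per-sample class probabilities across epochs, and samples whose
# ensembled prediction disagrees with the distant label are filtered out.
import numpy as np


def update_ensemble(ema_probs, epoch_probs, momentum=0.9):
    """Blend this epoch's predictions into the running ensemble."""
    return momentum * ema_probs + (1.0 - momentum) * epoch_probs


def keep_mask(ema_probs, distant_labels):
    """Keep samples whose ensembled prediction matches the distant label."""
    return ema_probs.argmax(axis=1) == distant_labels


# Usage after each training epoch: fold in the new predictions, then train
# the next epoch only on the surviving sample indices.
n_samples, n_classes = 1000, 5
ema_probs = np.full((n_samples, n_classes), 1.0 / n_classes)
epoch_probs = np.random.dirichlet(np.ones(n_classes), size=n_samples)  # stand-in
distant_labels = np.random.randint(0, n_classes, size=n_samples)       # stand-in
ema_probs = update_ensemble(ema_probs, epoch_probs)
train_indices = np.flatnonzero(keep_mask(ema_probs, distant_labels))
```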
This paper presents the DuluthNLP submission to Task 7 of the SemEval 2021 competition on Detecting and Rating Humor and Offense. In it, we explain the approach used to train the model together with the process of fine-tuning our model in getting the results. We focus on humor detection, rating, and offense rating, representing three out of the four subtasks that were provided. We show that optimizing hyper-parameters for learning rate, batch size and number of epochs can increase the accuracy and F1 score for humor detection.
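
The hyper-parameter tuning described above amounts to a grid search over learning rate, batch size, and epoch count. A minimal sketch follows; `fine_tune_and_evaluate` is a hypothetical placeholder for fine-tuning with the given settings and returning a validation F1 score, and the candidate values are illustrative.

```python
# Grid search over learning rate, batch size, and number of epochs.
# `fine_tune_and_evaluate` is a hypothetical placeholder: in practice it
# would fine-tune the model with the given settings and return validation F1.
import random
from itertools import product


def fine_tune_and_evaluate(lr, batch_size, num_epochs):
    return random.random()  # placeholder score; replace with real fine-tuning


best_score, best_config = -1.0, None
for lr, bs, epochs in product([1e-5, 2e-5, 3e-5], [16, 32], [2, 3, 4]):
    score = fine_tune_and_evaluate(lr, bs, epochs)
    if score > best_score:
        best_score, best_config = score, (lr, bs, epochs)

print("best (lr, batch size, epochs):", best_config, "F1:", round(best_score, 3))
```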
Neural topic models can augment or replace bag-of-words inputs with the learned representations of deep pre-trained transformer-based word prediction models. One added benefit when using representations from multilingual models is that they facilitate zero-shot polylingual topic modeling. However, while it has been widely observed that pre-trained embeddings should be fine-tuned to a given task, it is not immediately clear what supervision should look like for an unsupervised task such as topic modeling. Thus, we propose several methods for fine-tuning encoders to improve both monolingual and zero-shot polylingual neural topic modeling. We consider fine-tuning on auxiliary tasks, constructing a new topic classification task, integrating the topic classification objective directly into topic model training, and continued pre-training. We find that fine-tuning encoder representations on topic classification and integrating the topic classification task directly into topic modeling improves topic quality, and that fine-tuning encoder representations on any task is the most important factor for facilitating cross-lingual transfer.
