Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder

381 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yao Qiu

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yao Qiu - Jinchao Zhang - Jie Zhou

الحساب واللغة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Recent work has proposed several efficient approaches for generating gradient-based adversarial perturbations on embeddings and proved that the models performance and robustness can be improved when they are trained with these contaminated embeddings. While they paid little attention to how to help the model to learn these adversarial samples more efficiently. In this work, we focus on enhancing the models ability to defend gradient-based adversarial attack during the models training process and propose two novel adversarial training approaches: (1) CARL narrows the original sample and its adversarial sample in the representation space while enlarging their distance from different labeled samples. (2) RAR forces the model to reconstruct the original sample from its adversarial representation. Experiments show that the proposed two approaches outperform strong baselines on various text classification datasets. Analysis experiments find that when using our approaches, the semantic representation of the input sentence wont be significantly affected by adversarial perturbations, and the models performance drops less under adversarial attack. That is to say, our approaches can effectively improve the robustness of the model. Besides, RAR can also be used to generate text-form adversarial samples.

قيم البحث

175 - Xin Dong , Yaxin Zhu , Yupeng Zhang 2020

In cross-lingual text classification, one seeks to exploit labeled data from one language to train a text classification model that can then be applied to a completely different language. Recent multilingual representation models have made it much ea sier to achieve this. Still, there may still be subtle differences between languages that are neglected when doing so. To address this, we present a semi-supervised adversarial training process that minimizes the maximal loss for label-preserving input perturbations. The resulting model then serves as a teacher to induce labels for unlabeled target language samples that can be used during further adversarial training, allowing us to gradually adapt our model to the target language. Compared with a number of strong baselines, we observe significant gains in effectiveness on document and intent classification for a diverse set of languages.

الحساب واللغة

Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder

116 - Ji Feng , Qi-Zhi Cai , Zhi-Hua Zhou 2019

In this work, we consider one challenging training time attack by modifying training data with bounded perturbation, hoping to manipulate the behavior (both targeted or non-targeted) of any corresponding trained classifier during test time when facin g clean samples. To achieve this, we proposed to use an auto-encoder-like network to generate the pertubation on the training data paired with one differentiable system acting as the imaginary victim classifier. The perturbation generator will learn to update its weights by watching the training procedure of the imaginary classifier in order to produce the most harmful and imperceivable noise which in turn will lead the lowest generalization power for the victim classifier. This can be formulated into a non-linear equality constrained optimization problem. Unlike GANs, solving such problem is computationally challenging, we then proposed a simple yet effective procedure to decouple the alternating updates for the two networks for stability. The method proposed in this paper can be easily extended to the label specific setting where the attacker can manipulate the predictions of the victim classifiers according to some predefined rules rather than only making wrong predictions. Experiments on various datasets including CIFAR-10 and a reduced version of ImageNet confirmed the effectiveness of the proposed method and empirical results showed that, such bounded perturbation have good transferability regardless of which classifier the victim is actually using on image data.

التعلم الآلي التعلم الالي

BAE: BERT-based Adversarial Examples for Text Classification

282 - Siddhant Garg , Goutham Ramakrishnan 2020

Modern text classification models are susceptible to adversarial examples, perturb

الحساب واللغة

Gradient-based Adversarial Attacks against Text Transformers

226 - Chuan Guo , Alexandre Sablayrolles , Herve Jegou 2021

We propose the first general-purpose gradient-based attack against transformer models. Instead of searching for a single adversarial example, we search for a distribution of adversarial examples parameterized by a continuous-valued matrix, hence enab ling gradient-based optimization. We empirically demonstrate that our white-box attack attains state-of-the-art attack performance on a variety of natural language tasks. Furthermore, we show that a powerful black-box transfer attack, enabled by sampling from the adversarial distribution, matches or exceeds existing methods, while only requiring hard-label outputs.

الحساب واللغة الذكاء الاصطناعي التشفير والأمن

Bi-Adversarial Auto-Encoder for Zero-Shot Learning

65 - Yunlong Yu , Zhong Ji , Yanwei Pang 2018

Existing generative Zero-Shot Learning (ZSL) methods only consider the unidirectional alignment from the class semantics to the visual features while ignoring the alignment from the visual features to the class semantics, which fails to construct the visual-semantic interactions well. In this paper, we propose to synthesize visual features based on an auto-encoder framework paired with bi-adversarial networks respectively for visual and semantic modalities to reinforce the visual-semantic interactions with a bi-directional alignment, which ensures the synthesized visual features to fit the real visual distribution and to be highly related to the semantics. The encoder aims at synthesizing real-like visual features while the decoder forces both the real and the synthesized visual features to be more related to the class semantics. To further capture the discriminative information of the synthesized visual features, both the real and synthesized visual features are forced to be classified into the correct classes via a classification network. Experimental results on four benchmark datasets show that the proposed approach is particularly competitive on both the traditional ZSL and the generalized ZSL tasks.

الرؤية الحاسوبية وتمييز الأنماط

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشرق الأوسط - الأردن

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً