
Profanity-Avoiding Training Framework for Seq2seq Models with Certified Robustness


Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: profanity-avoiding training framework, training framework


Abstract

Seq2seq models have proven highly effective across a wide variety of applications. However, recent research has shown that inappropriate language in training samples and carefully crafted test cases can induce seq2seq models to output profanity. Such outputs can hurt the usability of seq2seq models and offend end users. To address this problem, we propose a training framework with certified robustness that eliminates the causes that trigger the generation of profanity. The proposed framework uses only a short list of profanity examples to prevent seq2seq models from generating a broader spectrum of profanity. It consists of a pattern-eliminating training component, which suppresses the impact of profane language patterns in the training set, and a trigger-resisting training component, which provides certified robustness against profanity-triggering expressions intentionally injected into test samples. In the experiments, we consider two representative NLP tasks to which seq2seq models can be applied: style transfer and dialogue generation. Extensive experimental results show that the proposed training framework successfully prevents the NLP models from generating profanity.

References used

https://aclanthology.org/


Certified Robustness to Word Substitution Attack with Differential Privacy

Association for Computational Linguistics, 2021, article

The robustness and security of natural language processing (NLP) models are significantly important in real-world applications. In the context of text classification tasks, adversarial examples can be designed by substituting words with synonyms under certain semantic and syntactic constraints, such that a well-trained model will give a wrong prediction. Therefore, it is crucial to develop techniques to provide a rigorous and provable robustness guarantee against such attacks. In this paper, we propose WordDP to achieve certified robustness against word substitution attacks in text classification via differential privacy (DP). We establish the connection between DP and adversarial robustness for the first time in the text domain and propose a conceptual exponential mechanism-based algorithm to formally achieve the robustness. We further present a practical simulated exponential mechanism that has efficient inference with certified robustness. We not only provide a rigorous analytic derivation of the certified condition but also experimentally compare the utility of WordDP with existing defense algorithms. The results show that WordDP achieves higher accuracy and more than 30X efficiency improvement over the state-of-the-art certified robustness mechanism in typical text classification tasks.
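For orientation, the textbook exponential mechanism from differential privacy, which the abstract names as the basis of its conceptual algorithm, can be sketched as follows. This is a minimal illustration of the generic mechanism and not WordDP itself: the candidate list, the utility scores, `epsilon`, and `sensitivity` are hypothetical inputs chosen for the example.

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(epsilon * u / (2 * sensitivity))."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    # Subtract the largest utility before exponentiating for numerical stability;
    # the shift cancels out after normalization.
    top = max(utilities)
    weights = [math.exp(scale * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def sample_candidate(candidates, utilities, epsilon, sensitivity=1.0, rng=random):
    """Draw one candidate at random according to the mechanism's probabilities."""
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

A smaller `epsilon` flattens the distribution toward uniform (more randomness, stronger privacy), while a larger `epsilon` concentrates probability on the highest-utility candidate.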

Keywords: word substitution attack, differential privacy, robustness

Certified Robustness to Programmable Transformations in LSTMs

Association for Computational Linguistics, 2021, article

Deep neural networks for natural language processing are fragile in the face of adversarial examples: small input perturbations, like synonym substitution or word duplication, which cause a neural network to change its prediction. We present an approach to certifying the robustness of LSTMs (and extensions of LSTMs) and training models that can be efficiently certified. Our approach can certify robustness to intractably large perturbation spaces defined programmatically in a language of string transformations. Our evaluation shows that (1) our approach can train models that are more robust to combinations of string transformations than those produced using existing techniques; (2) our approach can show high certification accuracy of the resulting models.
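To see why such perturbation spaces become intractable to enumerate, here is a minimal sketch covering only synonym substitution, one of the transformations the abstract mentions. The `synonyms` table and the function names are hypothetical and are not the paper's transformation language.

```python
from itertools import product


def perturbation_space(tokens, synonyms):
    """Yield every sentence obtained by keeping or substituting each word independently."""
    # Each position may keep its original token or take any listed synonym.
    options = [[tok] + list(synonyms.get(tok, [])) for tok in tokens]
    for choice in product(*options):
        yield list(choice)


def space_size(tokens, synonyms):
    """Count the perturbed sentences without enumerating them."""
    size = 1
    for tok in tokens:
        size *= 1 + len(synonyms.get(tok, []))
    return size
```

With two synonyms per word, a 20-word sentence already has 3^20 (about 3.5 billion) variants, which is why certification has to reason about the whole space at once instead of testing each variant.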

Keywords: programmable transformations, robustness to programmable, programmable

Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces

Association for Computational Linguistics, 2021, article

Hate speech and profanity detection suffer from data sparsity, especially for languages other than English, due to the subjective nature of the tasks and the resulting annotation incompatibility of existing corpora. In this study, we identify profane subspaces in word and sentence representations and explore their generalization capability on a variety of similar and distant target tasks in a zero-shot setting. This is done monolingually (German) and cross-lingually to closely-related (English), distantly-related (French) and non-related (Arabic) tasks. We observe that, on both similar and distant target tasks and across all languages, the subspace-based representations transfer more effectively than standard BERT representations in the zero-shot setting, with improvements between F1 +10.9 and F1 +42.9 over the baselines across all tested monolingual and cross-lingual scenarios.

Keywords: language detection, media with semantic, hate speech

BioCopy: A Plug-And-Play Span Copy Mechanism in Seq2Seq Models

Association for Computational Linguistics, 2021, article

Copy mechanisms explicitly obtain unchanged tokens from the source (input) sequence to generate the target (output) sequence under the neural seq2seq framework. However, most existing copy mechanisms consider only single-word copying from the source sentences, which results in losing essential tokens while copying long spans. In this work, we propose a plug-and-play architecture, namely BioCopy, to alleviate the aforementioned problem. Specifically, in the training stage, we construct a BIO tag for each token and train the original model jointly with the BIO tags. In the inference stage, the model first predicts the BIO tag at each time step, then applies different mask strategies based on the predicted BIO label to narrow the scope of the probability distributions over the vocabulary list. Experimental results on two separate generative tasks show that adding BioCopy to the original model structure outperforms the baseline models on both.
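The tag construction step could look roughly like the sketch below. It assumes that B marks the first token of a span copied from the source, I marks a continuation of that span, and O marks a token not copied; the abstract does not spell out the exact scheme, so the greedy longest-match rule and both function names are illustrative assumptions.

```python
def longest_match(source, target, start):
    """Length of the longest run target[start:start+n] found contiguously in source."""
    best = 0
    for s in range(len(source)):
        n = 0
        while (s + n < len(source) and start + n < len(target)
               and source[s + n] == target[start + n]):
            n += 1
        best = max(best, n)
    return best


def bio_tags(source, target):
    """Greedily tag each target token as B, I, or O relative to the source."""
    tags = []
    i = 0
    while i < len(target):
        n = longest_match(source, target, i)
        if n == 0:
            # Token does not occur in the source: treat it as generated.
            tags.append("O")
            i += 1
        else:
            # Start of a copied span followed by its continuation tokens.
            tags.extend(["B"] + ["I"] * (n - 1))
            i += n
    return tags
```

For the source "the cat sat on the mat" and the target "a cat sat on a mat", this rule marks "cat sat on" as one copied span (B I I) and "mat" as a single-token span (B), with both occurrences of "a" tagged O.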

Keywords: span copy mechanism, copy mechanisms explicitly, copy mechanisms

Smoothing and Shrinking the Sparse Seq2Seq Search Space

Association for Computational Linguistics, 2021, article

Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences. While this setup has led to strong results in a variety of tasks, one unsatisfying aspect is its length bias: models give high scores to short, inadequate hypotheses and often make the empty string the argmax (the so-called "cat got your tongue" problem). Recently proposed entmax-based sparse sequence-to-sequence models present a possible solution, since they can shrink the search space by assigning zero probability to bad hypotheses, but their ability to handle word-level tasks with transformers has never been tested. In this work, we show that entmax-based models effectively solve the cat got your tongue problem, removing a major source of model error for neural machine translation. In addition, we generalize label smoothing, a critical regularization technique, to the broader family of Fenchel-Young losses, which includes both cross-entropy and the entmax losses. Our resulting label-smoothed entmax loss models set a new state of the art on multilingual grapheme-to-phoneme conversion and deliver improvements and better calibration properties on cross-lingual morphological inflection and machine translation for 7 language pairs.
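The difference between dense and sparse normalization can be illustrated with sparsemax, the α = 2 member of the entmax family (softmax corresponds to α = 1). The sketch below is a plain-Python illustration chosen because sparsemax has a simple closed form; the abstract does not state which α the paper's models use, and the example scores are made up.

```python
import math


def softmax(scores):
    """Dense normalization: every entry receives strictly positive probability."""
    top = max(scores)
    exps = [math.exp(z - top) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection onto the probability simplex; low scores get exactly zero."""
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for j, z in enumerate(ordered, start=1):
        cumulative += z
        # The j-th largest score stays in the support while this condition holds.
        if 1.0 + j * z > cumulative:
            tau = (cumulative - 1.0) / j
    # Shift by the threshold and clip negatives to zero.
    return [max(z - tau, 0.0) for z in scores]
```

For scores `[2.0, 1.0, 0.1]`, `sparsemax` puts all probability on the first entry and exactly zero on the others, whereas `softmax` leaves every entry with some mass. Assigning exact zeros is what lets sparse models prune hypotheses from the search space.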

Keywords: search space, shrinking the sparse, smoothing and shrinking
