Deep neural networks have been shown to be vulnerable to a new threat called backdoor attacks, in which an adversary injects a backdoor into a model by poisoning its training dataset. When an input contains a special pattern known as the backdoor trigger, the backdoored model carries out a malicious task specified by the adversary, such as misclassification. In text classification systems, backdoors inserted into models can allow spam or malicious speech to evade detection. Previous work has focused mainly on defending against backdoor attacks in computer vision; little attention has been paid to defenses against RNN backdoor attacks in text classification. In this paper, by analyzing the changes in the inner LSTM neurons, we propose a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks that the adversary performs against LSTM-based text classification through data poisoning. The method identifies and excludes from the training data the poisoning samples crafted to insert the backdoor into the model, without requiring a verified and trusted dataset. We evaluate our method on four text classification datasets: IMDB, DBpedia ontology, 20 Newsgroups, and Reuters-21578. It achieves good performance on all of them regardless of the trigger sentences.
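The abstract only sketches the idea of scoring words by how strongly they perturb the LSTM's internal state. The snippet below is a minimal, assumed PyTorch illustration of that kind of neuron-change scoring; the model architecture, the leave-one-word-out procedure, and the L2-norm impact score are illustrative choices and not the authors' exact implementation.

```python
import torch
import torch.nn as nn

# Illustrative LSTM text classifier; hyper-parameters are placeholders.
class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def hidden_states(self, token_ids):
        # Per-step LSTM hidden states, shape (batch, seq_len, hidden_dim).
        out, _ = self.lstm(self.embed(token_ids))
        return out

    def forward(self, token_ids):
        return self.fc(self.hidden_states(token_ids)[:, -1, :])


def word_impact_scores(model, token_ids):
    """Score each word by how much removing it changes the LSTM's final
    hidden state -- a rough proxy for the neuron-change analysis described
    in the abstract (illustrative metric, not the paper's exact one)."""
    model.eval()
    with torch.no_grad():
        full = model.hidden_states(token_ids.unsqueeze(0))[0, -1]
        scores = []
        for i in range(token_ids.size(0)):
            reduced = torch.cat([token_ids[:i], token_ids[i + 1:]])
            if reduced.numel() == 0:
                scores.append(0.0)
                continue
            h = model.hidden_states(reduced.unsqueeze(0))[0, -1]
            scores.append(torch.norm(full - h).item())
    return scores  # one impact score per token position

# Example (hypothetical token ids):
# model = LSTMClassifier(vocab_size=5000)
# scores = word_impact_scores(model, torch.tensor([12, 7, 431, 9]))
```

Under this reading of the method, words whose removal repeatedly produces outsized state changes across many training samples are candidate backdoor keywords, and samples containing them can be filtered out before retraining.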
Machine learning (ML) has progressed rapidly during the past decade and ML models have been deployed in various real-world applications. Meanwhile, machine learning models have been shown to be vulnerable to various security and privacy attacks. One
Pre-Trained Models (PTMs) have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. When the triggers are activated, ev
Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards without revea
Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. The training of speaker verification requires a large amount of data, therefore users usually need to adopt third-party data (e.g.
As collaborative learning allows joint training of a model using multiple sources of data, the security problem has been a central concern. Malicious users can upload poisoned data to prevent the model's convergence or inject hidden backdoors. The so-