Certified Defenses for Adversarial Patches

98 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ping-Yeh Chiang

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ping-Yeh Chiang - Renkun Ni - Ahmed Abdelkader

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Adversarial patch attacks are among one of the most practical threat models against real-world computer vision systems. This paper studies certified and empirical defenses against patch attacks. We begin with a set of experiments showing that most existing defenses, which work by pre-processing input images to mitigate adversarial patches, are easily broken by simple white-box adversaries. Motivated by this finding, we propose the first certified defense against patch attacks, and propose faster methods for its training. Furthermore, we experiment with different patch shapes for testing, obtaining surprisingly good robustness transfer across shapes, and present preliminary results on certified defense against sparse attacks. Our complete implementation can be found on: https://github.com/Ping-C/certifiedpatchdefense.

قيم البحث

96 - Deqiang Li , Qianmu Li 2020

Malware remains a big threat to cyber security, calling for machine learning based malware detection. While promising, such detectors are known to be vulnerable to evasion attacks. Ensemble learning typically facilitates countermeasures, while attack ers can leverage this technique to improve attack effectiveness as well. This motivates us to investigate which kind of robustness the ensemble defense or effectiveness the ensemble attack can achieve, particularly when they combat with each other. We thus propose a new attack approach, named mixture of attacks, by rendering attackers capable of multiple generative methods and multiple manipulation sets, to perturb a malware example without ruining its malicious functionality. This naturally leads to a new instantiation of adversarial training, which is further geared to enhancing the ensemble of deep neural networks. We evaluate defenses using Android malware detectors against 26 different attacks upon two practical datasets. Experimental results show that the new adversarial training significantly enhances the robustness of deep neural networks against a wide range of attacks, ensemble methods promote the robustness when base classifiers are robust enough, and yet ensemble attacks can evade the enhanced malware detectors effectively, even notably downgrading the VirusTotal service.

التشفير والأمن التعلم الآلي التعلم الالي

Generative Adversarial Networks for Synthesizing InSAR Patches

97 - Philipp Sibler , Yuanyuan Wang , Stefan Auer 2020

Generative Adversarial Networks (GANs) have been employed with certain success for image translation tasks between optical and real-valued SAR intensity imagery. Applications include aiding interpretability of SAR scenes with their optical counterpar ts by artificial patch generation and automatic SAR-optical scene matching. The synthesis of artificial complex-valued InSAR image stacks asks for, besides good perceptual quality, more stringent quality metrics like phase noise and phase coherence. This paper provides a signal processing model of generative CNN structures, describes effects influencing those quality metrics and presents a mapping scheme of complex-valued data to given CNN structures based on popular Deep Learning frameworks.

معالجة الإشارات التعلم الآلي معالجة الصور والفيديو

Scalable Differential Privacy with Certified Robustness in Adversarial Learning

74 - NhatHai Phan , My T. Thai , Han Hu 2019

In this paper, we aim to develop a scalable algorithm to preserve differential privacy (DP) in adversarial learning for deep neural networks (DNNs), with certified robustness to adversarial examples. By leveraging the sequential composition theory in DP, we randomize both input and latent spaces to strengthen our certified robustness bounds. To address the trade-off among model utility, privacy loss, and robustness, we design an original adversarial objective function, based on the post-processing property in DP, to tighten the sensitivity of our model. A new stochastic batch training is proposed to apply our mechanism on large DNNs and datasets, by bypassing the vanilla iterative batch-by-batch training in DP DNNs. An end-to-end theoretical analysis and evaluations show that our mechanism notably improves the robustness and scalability of DP DNNs.

التشفير والأمن

Certified Robustness to Adversarial Word Substitutions

324 - Robin Jia , Aditi Raghunathan , Kerem Goksel 2019

State-of-the-art NLP models can often be fooled by adversaries that apply seemingly innocuous label-preserving transformations (e.g., paraphrasing) to input text. The number of possible transformations scales exponentially with text length, so data a ugmentation cannot cover all transformations of an input. This paper considers one exponentially large family of label-preserving transformations, in which every word in the input can be replaced with a similar word. We train the first models that are provably robust to all word substitutions in this family. Our training procedure uses Interval Bound Propagation (IBP) to minimize an upper bound on the worst-case loss that any combination of word substitutions can induce. To evaluate models robustness to these transformations, we measure accuracy on adversarially chosen word substitutions applied to test examples. Our IBP-trained models attain $75%$ adversarial accuracy on both sentiment analysis on IMDB and natural language inference on SNLI. In comparison, on IMDB, models trained normally and ones trained with data augmentation achieve adversarial accuracy of only $8%$ and $35%$, respectively.

الحساب واللغة التعلم الآلي

Intrusion Detection for Industrial Control Systems: Evaluation Analysis and Adversarial Attacks

318 - Giulio Zizzo , Chris Hankin , Sergio Maffeis 2019

Neural networks are increasingly used in security applications for intrusion detection on industrial control systems. In this work we examine two areas that must be considered for their effective use. Firstly, is their vulnerability to adversarial at tacks when used in a time series setting. Secondly, is potential over-estimation of performance arising from data leakage artefacts. To investigate these areas we implement a long short-term memory (LSTM) based intrusion detection system (IDS) which effectively detects cyber-physical attacks on a water treatment testbed representing a strong baseline IDS. For investigating adversarial attacks we model two different white box attackers. The first attacker is able to manipulate sensor readings on a subset of the Secure Water Treatment (SWaT) system. By creating a stream of adversarial data the attacker is able to hide the cyber-physical attacks from the IDS. For the cyber-physical attacks which are detected by the IDS, the attacker required on average 2.48 out of 12 total sensors to be compromised for the cyber-physical attacks to be hidden from the IDS. The second attacker model we explore is an $L_{infty}$ bounded attacker who can send fake readings to the IDS, but to remain imperceptible, limits their perturbations to the smallest $L_{infty}$ value needed. Additionally, we examine data leakage problems arising from tuning for $F_1$ score on the whole SWaT attack set and propose a method to tune detection parameters that does not utilise any attack data. If attack after-effects are accounted for then our new parameter tuning method achieved an $F_1$ score of 0.811$pm$0.0103.

التشفير والأمن التعلم الآلي التعلم الالي