
Poisoned classifiers are not only backdoored, they are fundamentally broken

Published by Mingjie Sun
Publication date: 2020
Research field: Informatics Engineering
Research language: English





Under a commonly-studied backdoor poisoning attack against classification models, an attacker adds a small trigger to a subset of the training data, such that the presence of this trigger at test time causes the classifier to always predict some target class. It is often implicitly assumed that the poisoned classifier is vulnerable exclusively to the adversary who possesses the trigger. In this paper, we show empirically that this view of backdoored classifiers is fundamentally incorrect. We demonstrate that anyone with access to the classifier, even without access to any original training data or trigger, can construct several alternative triggers that are as effective or more so at eliciting the target class at test time. We construct these alternative triggers by first generating adversarial examples for a smoothed version of the classifier, created with a recent process called Denoised Smoothing, and then extracting colors or cropped portions of adversarial images. We demonstrate the effectiveness of our attack through extensive experiments on ImageNet and TrojAI datasets, including a user study which demonstrates that our method allows users to easily determine the existence of such backdoors in existing poisoned classifiers. Furthermore, we demonstrate that our alternative triggers can in fact look entirely different from the original trigger, highlighting that the backdoor actually learned by the classifier differs substantially from the trigger image itself. Thus, we argue that there is no such thing as a secret backdoor in poisoned classifiers: poisoning a classifier invites attacks not just by the party that possesses the trigger, but from anyone with access to the classifier. Code is available at https://github.com/locuslab/breaking-poisoned-classifier.
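
The crop-and-reuse step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): it skips Denoised Smoothing and adversarial example generation entirely, and `toy_classifier`, the image sizes, and the red-corner backdoor rule are all hypothetical stand-ins chosen only to show how a patch cropped from an adversarial image might be pasted onto clean inputs and scored by attack success rate.

```python
import numpy as np

def crop_patch(image, top, left, size):
    """Return a size x size crop of an HxWxC image as a candidate trigger."""
    return image[top:top + size, left:left + size].copy()

def paste_trigger(image, patch, top=0, left=0):
    """Return a copy of `image` with `patch` overwritten at (top, left)."""
    out = image.copy()
    h, w = patch.shape[:2]
    out[top:top + h, left:left + w] = patch
    return out

def attack_success_rate(classify, images, patch, target):
    """Fraction of images classified as `target` once the patch is applied."""
    hits = [classify(paste_trigger(img, patch)) == target for img in images]
    return float(np.mean(hits))

# Hypothetical stand-in for a poisoned classifier: it predicts class 1
# whenever the top-left 4x4 corner is almost pure red, and class 0 otherwise.
def toy_classifier(image):
    corner = image[:4, :4]
    is_red = corner[..., 0].mean() > 0.8 and corner[..., 1:].mean() < 0.2
    return 1 if is_red else 0

rng = np.random.default_rng(0)
clean_images = [rng.uniform(0.3, 0.7, size=(16, 16, 3)) for _ in range(20)]

# Assumption: this array plays the role of an adversarial image whose
# perturbation exposes a red region somewhere away from the corner.
adversarial = rng.uniform(0.3, 0.7, size=(16, 16, 3))
adversarial[8:12, 8:12] = [1.0, 0.0, 0.0]

# Crop the suspicious region and reuse it as an alternative trigger.
patch = crop_patch(adversarial, top=8, left=8, size=4)
rate_clean = float(np.mean([toy_classifier(img) == 1 for img in clean_images]))
rate_triggered = attack_success_rate(toy_classifier, clean_images, patch, target=1)
print(rate_clean, rate_triggered)
```

In this toy setting the cropped patch flips every clean image to the target class, which mirrors the abstract's point that a trigger recovered from the model alone can work without access to the original trigger or training data.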




Read also

Understanding the links between the activity of supermassive black holes (SMBH) at the centres of galaxies and their host dark matter haloes is a key question in modern astrophysics. The final data release of the SDSS-IV eBOSS provides the largest contemporary spectroscopic sample of galaxies and QSOs. Using this sample and covering the redshift interval $z=0.7-1.1$, we have measured the clustering properties of the eBOSS QSOs, Emission Line Galaxies (ELGs) and Luminous Red Galaxies (LRGs). We have also measured the fraction of QSOs as a function of the overdensity defined by the galaxy population. Using these measurements, we investigate how QSOs populate and sample the galaxy population, and how the host dark-matter haloes of QSOs sample the underlying halo distribution. We find that the probability of a galaxy hosting a QSO is independent of the host dark matter halo mass of the galaxy. We also find that about 60% of eBOSS QSOs are hosted by LRGs and about 20-40% of QSOs are hosted by satellite galaxies. We find a slight preference for QSOs to populate satellite galaxies over central galaxies. This is connected to the host halo mass distribution of different types of galaxies. Based on our analysis, QSOs should be hosted by a very broad distribution of haloes, and their occurrence should be modulated only by the efficiency of galaxy formation processes.
The finding that massive galaxies grow with cosmic time fired the starting gun for the search for objects that could have survived up to the present day without suffering substantial changes (neither in their structures nor in their stellar populations). Nevertheless, despite the community's efforts, only one firm candidate for such a relic is known so far: NGC 1277. Curiously, this galaxy is located at the centre of one of the richest nearby galaxy clusters: Perseus. Is its location a matter of chance? Should relic hunters focus their search on galaxy clusters? To answer this question, we have performed a simultaneous and analogous analysis using simulations (Millennium I-WMAP7) and observations (New York University Value-Added Galaxy Catalogue). Our results in both frameworks agree: it is more probable to find relics in high-density environments.
Invariance to geometric transformations is a highly desirable property of automatic classifiers in many image recognition tasks. Nevertheless, it is unclear to what extent state-of-the-art classifiers are invariant to basic transformations such as rotations and translations. This is mainly due to the lack of general methods that properly measure such invariance. In this paper, we propose a rigorous and systematic approach for quantifying the invariance to geometric transformations of any classifier. Our key idea is to cast the problem of assessing a classifier's invariance as the computation of geodesics along the manifold of transformed images. We propose the Manitest method, built on the efficient Fast Marching algorithm, to compute the invariance of classifiers. Our new method quantifies in particular the importance of data augmentation for learning invariance from data, and the increased invariance of convolutional neural networks with depth. We foresee that the proposed generic tool for measuring invariance to a large class of geometric transformations and arbitrary classifiers will have many applications for evaluating and comparing classifiers based on their invariance, and will help improve the invariance of existing classifiers.
201 - Jesse Johnson 2018
In order to choose a neural network architecture that will be effective for a particular modeling problem, one must understand the limitations imposed by each of the potential options. These limitations are typically described in terms of information theoretic bounds, or by comparing the relative complexity needed to approximate example functions between different architectures. In this paper, we examine the topological constraints that the architecture of a neural network imposes on the level sets of all the functions that it is able to approximate. This approach is novel for both the nature of the limitations and the fact that they are independent of network depth for a broad family of activation functions.
498 - Russell K. Standish 2013
Anthropic reasoning is a form of statistical reasoning based upon finding oneself a member of a particular reference class of conscious beings. By considering empirical distribution functions defined over animal life on Earth, we can deduce that the vast bulk of animal life is unlikely to be conscious.
