Why Adversarial Reprogramming Works, When It Fails, and How to Tell the Difference

Published by Ambra Demontis, Ph.D.

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Yang Zheng - Xiaoyi Feng - Zhaoqiang Xia

Machine Learning

Adversarial reprogramming allows repurposing a machine-learning model to perform a different task. For example, a model trained to recognize animals can be reprogrammed to recognize digits by embedding an adversarial program in the digit images provided as input. Recent work has shown that adversarial reprogramming may not only be used to abuse machine-learning models provided as a service, but also beneficially, to improve transfer learning when training data is scarce. However, the factors affecting its success are still largely unexplained. In this work, we develop a first-order linear model of adversarial reprogramming to show that its success inherently depends on the size of the average input gradient, which grows when input gradients are more aligned, and when inputs have higher dimensionality. The results of our experimental analysis, involving fourteen distinct reprogramming tasks, show that the above factors are correlated with the success and the failure of adversarial reprogramming.
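The first-order argument in the abstract can be illustrated with a small NumPy sketch. Under a linearization, adding one shared program `delta` to every input changes the loss on sample i by roughly `g_i · delta`, so the average change is `mean_grad · delta`; for a program bounded by `||delta||_inf <= eps`, the largest achievable decrease is `eps * ||mean_grad||_1`. The synthetic ±1 gradients, the `alignment` knob, the l-infinity budget, and both function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def synthetic_grads(n, d, alignment, rng):
    """Hypothetical input gradients: a shared +/-1 direction mixed with
    per-sample +/-1 noise. alignment=1 means all gradients are identical,
    alignment=0 means they are independent."""
    shared = rng.choice([-1.0, 1.0], size=d)
    noise = rng.choice([-1.0, 1.0], size=(n, d))
    return alignment * shared + (1.0 - alignment) * noise


def best_first_order_decrease(grads, eps):
    """Best average loss decrease under the linear model for a single
    shared perturbation with ||delta||_inf <= eps.

    The optimal delta is -eps * sign(mean gradient), which yields a
    decrease of eps * ||mean gradient||_1."""
    mean_grad = grads.mean(axis=0)
    delta = -eps * np.sign(mean_grad)
    decrease = -(grads @ delta).mean()
    return decrease, delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eps = 0.1
    # Effect of alignment at fixed dimensionality.
    for alignment in (0.0, 0.5, 1.0):
        g = synthetic_grads(n=200, d=100, alignment=alignment, rng=rng)
        dec, _ = best_first_order_decrease(g, eps)
        print(f"alignment={alignment:.1f}  d=100  decrease={dec:.3f}")
    # Effect of dimensionality at fixed alignment.
    for d in (100, 400, 1600):
        g = synthetic_grads(n=200, d=d, alignment=0.5, rng=rng)
        dec, _ = best_first_order_decrease(g, eps)
        print(f"alignment=0.5  d={d}  decrease={dec:.3f}")
```

In this toy setting, unaligned gradients largely cancel when averaged, so the achievable decrease is small, while aligned gradients or a larger input dimension leave a larger average gradient for the shared program to exploit.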

Anthony Sicilia, Xingchen Zhao, Seong Jae Hwang, 2021

Theoretically, domain adaptation is a well-researched problem. Further, this theory has been well-used in practice. In particular, we note the bound on target error given by Ben-David et al. (2010) and the well-known domain-aligning algorithm based on this work using Domain Adversarial Neural Networks (DANN) presented by Ganin and Lempitsky (2015). Recently, multiple variants of DANN have been proposed for the related problem of domain generalization, but without much discussion of the original motivating bound. In this paper, we investigate the validity of DANN in domain generalization from this perspective. We investigate conditions under which application of DANN makes sense and further consider DANN as a dynamic process during training. Our investigation suggests that the application of DANN to domain generalization may not be as straightforward as it seems. To address this, we design an algorithmic extension to DANN in the domain generalization case. Our experimentation validates both theory and algorithm.

Machine Learning, Computer Vision and Pattern Recognition

Why Indexing Works

J. B. Heaton, N. G. Polson, J. H. Witte, 2015

We develop a simple stock selection model to explain why active equity managers tend to underperform a benchmark index. We motivate our model with the empirical observation that the best performing stocks in a broad market index often perform much better than the other stocks in the index. Randomly selecting a subset of securities from the index may dramatically increase the chance of underperforming the index. The relative likelihood of underperformance by investors choosing active management likely is much more important than the loss to those same investors from the higher fees for active management relative to passive index investing. Thus, active management may be even more challenging than previously believed, and the stakes for finding the best active managers may be larger than previously assumed.

Portfolio Management, Statistical Finance

Larger-Context Tagging: When and Why Does It Work?

Jinlan Fu, Liangjing Feng, Qi Zhang, 2021

The development of neural networks and pretraining techniques has spawned many sentence-level tagging systems that achieved superior performance on typical benchmarks. However, a relatively less discussed topic is what if more context information is introduced into current top-scoring tagging systems. Although several existing works have attempted to shift tagging systems from sentence-level to document-level, there is still no consensus conclusion about when and why it works, which limits the applicability of the larger-context approach in tagging tasks. In this paper, instead of pursuing a state-of-the-art tagging system by architectural exploration, we focus on investigating when and why the larger-context training, as a general strategy, can work. To this end, we conduct a thorough comparative study on four proposed aggregators for context information collecting and present an attribute-aided evaluation method to interpret the improvement brought by larger-context training. Experimentally, we set up a testbed based on four tagging tasks and thirteen datasets. Hopefully, our preliminary observations can deepen the understanding of larger-context training and enlighten more follow-up works on the use of contextual information.

Computation and Language

Why to Decouple the Uplink and Downlink in Cellular Networks and How To Do It

Federico Boccardi, Jeffrey Andrews, Hisham Elshaer, 2015

Ever since the inception of mobile telephony, the downlink and uplink of cellular networks have been coupled, i.e. mobile terminals have been constrained to associate with the same base station (BS) in both the downlink and uplink directions. New trends in network densification and mobile data usage increase the drawbacks of this constraint, and suggest that it should be revisited. In this paper we identify and explain five key arguments in favor of Downlink/Uplink Decoupling (DUDe) based on a blend of theoretical, experimental, and logical arguments. We then overview the changes needed in current (LTE-A) mobile systems to enable this decoupling, and then look ahead to fifth generation (5G) cellular standards. We believe the introduced paradigm will lead to significant gains in network throughput, outage and power consumption at a much lower cost compared to other solutions providing comparable or lower gains.

Networking and Internet Architecture

On Statistical Bias In Active Learning: How and When To Fix It

Sebastian Farquhar, Yarin Gal, Tom Rainforth, 2021

Active learning is a powerful tool when labelling data is expensive, but it introduces a bias because the training data no longer follows the population distribution. We formalize this bias and investigate the situations in which it can be harmful and sometimes even helpful. We further introduce novel corrective weights to remove bias when doing so is beneficial. Through this, our work not only provides a useful mechanism that can improve the active learning approach, but also an explanation of the empirical successes of various existing approaches which ignore this bias. In particular, we show that this bias can be actively helpful when training overparameterized models -- like neural networks -- with relatively little data.

Machine Learning