Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Adversarially Trained Models with Test-Time Covariate Shift Adaptation

103 0 0.0 ( 0 )

Download Cite

Added by Jay Nandy

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Jay Nandy - Sudipan Saha - Wynne Hsu

Machine Learning Artificial Intelligence

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We empirically demonstrate that test-time adaptive batch normalization, which re-estimates the batch-normalization statistics during inference, can provide $ell_2$-certification as well as improve the commonly occurring corruption robustness of adversarially trained models while maintaining their state-of-the-art empirical robustness against adversarial attacks. Furthermore, we obtain similar $ell_2$-certification as the current state-of-the-art certification models for CIFAR-10 by learning our adversarially trained model using larger $ell_2$-bounded adversaries. Therefore our work is a step towards bridging the gap between the state-of-the-art certification and empirical robustness. Our results also indicate that improving the empirical adversarial robustness may be sufficient as we achieve certification and corruption robustness as a by-product using test-time adaptive batch normalization.

rate research

Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models

123 - Thuy-Trang Vu , Dinh Phung , Gholamreza Haffari 2020

Recent work has shown the importance of adaptation of broad-coverage contextualised embedding models on the domain of the target task of interest. Current self-supervised adaptation methods are simplistic, as the training signal comes from a small percentage of emph{randomly} masked-out tokens. In this paper, we show that careful masking strategies can bridge the knowledge gap of masked language models (MLMs) about the domains more effectively by allocating self-supervision where it is needed. Furthermore, we propose an effective training strategy by adversarially masking out those tokens which are harder to reconstruct by the underlying MLM. The adversarial objective leads to a challenging combinatorial optimisation problem over emph{subsets} of tokens, which we tackle efficiently through relaxation to a variational lowerbound and dynamic programming. On six unsupervised domain adaptation tasks involving named entity recognition, our method strongly outperforms the random masking strategy and achieves up to +1.64 F1 score improvements.

Computation and Language

Smoothed Inference for Adversarially-Trained Models

88 - Yaniv Nemcovsky , Evgenii Zheltonozhskii , Chaim Baskin 2019

Deep neural networks are known to be vulnerable to adversarial attacks. Current methods of defense from such attacks are based on either implicit or explicit regularization, e.g., adversarial training. Randomized smoothing, the averaging of the classifier outputs over a random distribution centered in the sample, has been shown to guarantee the performance of a classifier subject to bounded perturbations of the input. In this work, we study the application of randomized smoothing as a way to improve performance on unperturbed data as well as to increase robustness to adversarial attacks. The proposed technique can be applied on top of any existing adversarial defense, but works particularly well with the randomized approaches. We examine its performance on common white-box (PGD) and black-box (transfer and NAttack) attacks on CIFAR-10 and CIFAR-100, substantially outperforming previous art for most scenarios and comparable on others. For example, we achieve 60.4% accuracy under a PGD attack on CIFAR-10 using ResNet-20, outperforming previous art by 11.7%. Since our method is based on sampling, it lends itself well for trading-off between the model inference complexity and its performance. A reference implementation of the proposed techniques is provided at https://github.com/yanemcovsky/SIAM

Machine Learning Computer Vision and Pattern Recognition Machine Learning

Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift

125 - Zachary Nado , Shreyas Padhy , D. Sculley 2020

Covariate shift has been shown to sharply degrade both predictive accuracy and the calibration of uncertainty estimates for deep learning models. This is worrying, because covariate shift is prevalent in a wide range of real world deployment settings. However, in this paper, we note that frequently there exists the potential to access small unlabeled batches of the shifted data just before prediction time. This interesting observation enables a simple but surprisingly effective method which we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift. Using this one line code change, we achieve state-of-the-art on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or modification of the training pipeline. We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness (e.g. deep ensembles) and combining the two further improves performance. Our findings are supported by detailed measurements of the effect of this strategy on model behavior across rigorous ablations on various dataset modalities. However, the method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift, and is therefore worthy of additional study. We include links to the data in our figures to improve reproducibility, including a Python notebooks that can be run to easily modify our analysis at https://colab.research.google.com/drive/11N0wDZnMQQuLrRwRoumDCrhSaIhkqjof.

Machine Learning Machine Learning

Robust Importance Weighting for Covariate Shift

135 - Henry Lam , Fengpei Li , Siddharth Prusty 2019

In many learning problems, the training and testing data follow different distributions and a particularly common situation is the textit{covariate shift}. To correct for sampling biases, most approaches, including the popular kernel mean matching (KMM), focus on estimating the importance weights between the two distributions. Reweighting-based methods, however, are exposed to high variance when the distributional discrepancy is large and the weights are poorly estimated. On the other hand, the alternate approach of using nonparametric regression (NR) incurs high bias when the training size is limited. In this paper, we propose and analyze a new estimator that systematically integrates the residuals of NR with KMM reweighting, based on a control-variate perspective. The proposed estimator can be shown to either strictly outperform or match the best-known existing rates for both KMM and NR, and thus is a robust combination of both estimators. The experiments shows the estimator works well in practice.

Machine Learning Statistics Theory Machine Learning

Fairness Violations and Mitigation under Covariate Shift

255 - Harvineet Singh , Rina Singh , Vishwali Mhasawade 2019

We study the problem of learning fair prediction models for unseen test sets distributed differently from the train set. Stability against changes in data distribution is an important mandate for responsible deployment of models. The domain adaptation literature addresses this concern, albeit with the notion of stability limited to that of prediction accuracy. We identify sufficient conditions under which stable models, both in terms of prediction accuracy and fairness, can be learned. Using the causal graph describing the data and the anticipated shifts, we specify an approach based on feature selection that exploits conditional independencies in the data to estimate accuracy and fairness metrics for the test set. We show that for specific fairness definitions, the resulting model satisfies a form of worst-case optimality. In context of a healthcare task, we illustrate the advantages of the approach in making more equitable decisions.

Machine Learning Computers and Society Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Adversarially Trained Models with Test-Time Covariate Shift Adaptation

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions