أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Sijia Liu

Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD

109 - Chen Fan , Parikshit Ram , Sijia Liu 2021

We propose a new computationally-efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML). The key enabling technique is to interpret MAML as a bilevel optimization (BLO) problem and leverage the sign-based SGD(signSGD) as a lower-leve l optimizer of BLO. We show that MAML, through the lens of signSGD-oriented BLO, naturally yields an alternating optimization scheme that just requires first-order gradients of a learned meta-model. We term the resulting MAML algorithm Sign-MAML. Compared to the conventional first-order MAML (FO-MAML) algorithm, Sign-MAML is theoretically-grounded as it does not impose any assumption on the absence of second-order derivatives during meta training. In practice, we show that Sign-MAML outperforms FO-MAML in various few-shot image classification tasks, and compared to MAML, it achieves a much more graceful tradeoff between classification accuracy and computation efficiency.

التعلم الآلي الذكاء الاصطناعي

Certifiably Robust Interpretation via Renyi Differential Privacy

124 - Ao Liu , Xiaoyu Chen , Sijia Liu 2021

Motivated by the recent discovery that the interpretation maps of CNNs could easily be manipulated by adversarial attacks against network interpretability, we study the problem of interpretation robustness from a new perspective of Renyi differential privacy (RDP). The advantages of our Renyi-Robust-Smooth (RDP-based interpretation method) are three-folds. First, it can offer provable and certifiable top-$k$ robustness. That is, the top-$k$ important attributions of the interpretation map are provably robust under any input perturbation with bounded $ell_d$-norm (for any $dgeq 1$, including $d = infty$). Second, our proposed method offers $sim10%$ better experimental robustness than existing approaches in terms of the top-$k$ attributions. Remarkably, the accuracy of Renyi-Robust-Smooth also outperforms existing approaches. Third, our method can provide a smooth tradeoff between robustness and computational efficiency. Experimentally, its top-$k$ attributions are {em twice} more robust than existing approaches when the computational resources are highly constrained.

التعلم الآلي التعلم الالي

Preserving Earlier Knowledge in Continual Learning with the Help of All Previous Feature Extractors

154 - Zhuoyun Li , Changhong Zhong , Sijia Liu 2021

Continual learning of new knowledge over time is one desirable capability for intelligent systems to recognize more and more classes of objects. Without or with very limited amount of old data stored, an intelligent system often catastrophically forg ets previously learned old knowledge when learning new knowledge. Recently, various approaches have been proposed to alleviate the catastrophic forgetting issue. However, old knowledge learned earlier is commonly less preserved than that learned more recently. In order to reduce the forgetting of particularly earlier learned old knowledge and improve the overall continual learning performance, we propose a simple yet effective fusion mechanism by including all the previously learned feature extractors into the intelligent model. In addition, a new feature extractor is included to the model when learning a new set of classes each time, and a feature extractor pruning is also applied to prevent the whole model size from growing rapidly. Experiments on multiple classification tasks show that the proposed approach can effectively reduce the forgetting of old knowledge, achieving state-of-the-art continual learning performance.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Preserve, Promote, or Attack? GNN Explanation via Topology Perturbation

286 - Yi Sun , Abel Valente , Sijia Liu 2021

Prior works on formalizing explanations of a graph neural network (GNN) focus on a single use case - to preserve the prediction results through identifying important edges and nodes. In this paper, we develop a multi-purpose interpretation framework by acquiring a mask that indicates topology perturbations of the input graphs. We pack the framework into an interactive visualization system (GNNViz) which can fulfill multiple purposes: Preserve,Promote, or Attack GNNs predictions. We illustrate our approachs novelty and effectiveness with three case studies: First, GNNViz can assist non expert users to easily explore the relationship between graph topology and GNNs decision (Preserve), or to manipulate the prediction (Promote or Attack) for an image classification task on MS-COCO; Second, on the Pokec social network dataset, our framework can uncover unfairness and demographic biases; Lastly, it compares with state-of-the-art GNN explainer baseline on a synthetic dataset.

التعلم الآلي

Generating Adversarial Computer Programs using Optimized Obfuscations

113 - Shashank Srikant , Sijia Liu , Tamara Mitrovska 2021

Machine learning (ML) models that learn and predict properties of computer programs are increasingly being adopted and deployed. These models have demonstrated success in applications such as auto-completing code, summarizing large programs, and dete cting bugs and malware in programs. In this work, we investigate principled ways to adversarially perturb a computer program to fool such learned models, and thus determine their adversarial robustness. We use program obfuscations, which have conventionally been used to avoid attempts at reverse engineering programs, as adversarial perturbations. These perturbations modify programs in ways that do not alter their functionality but can be crafted to deceive an ML model when making a decision. We provide a general formulation for an adversarial program that allows applying multiple obfuscation transformations to a program in any language. We develop first-order optimization algorithms to efficiently determine two key aspects -- which parts of the program to transform, and what transformations to use. We show that it is important to optimize both these aspects to generate the best adversarially perturbed program. Due to the discrete nature of this problem, we also propose using randomized smoothing to improve the attack loss landscape to ease optimization. We evaluate our work on Python and Java programs on the problem of program summarization. We show that our best attack proposal achieves a $52%$ improvement over a state-of-the-art attack generation approach for programs trained on a seq2seq model. We further show that our formulation is better at training models that are robust to adversarial attacks.

التعلم الآلي لغات البرمجة هندسة البرمجيات

On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning

125 - Ren Wang , Kaidi Xu , Sijia Liu 2021

Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning. It enables us to learn a meta-initialization} of model parameters (that we call meta-model) to rapidly adapt to new tasks usi ng a small amount of labeled training data. Despite the generalization power of the meta-model, it remains elusive that how adversarial robustness can be maintained by MAML in few-shot learning. In addition to generalization, robustness is also desired for a meta-model to defend adversarial examples (attacks). Toward promoting adversarial robustness in MAML, we first study WHEN a robustness-promoting regularization should be incorporated, given the fact that MAML adopts a bi-level (fine-tuning vs. meta-update) learning procedure. We show that robustifying the meta-update stage is sufficient to make robustness adapted to the task-specific fine-tuning stage even if the latter uses a standard training protocol. We also make additional justification on the acquired robustness adaptation by peering into the interpretability of neurons activation maps. Furthermore, we investigate HOW robust regularization can efficiently be designed in MAML. We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning. In particular, we for the first time show that the auxiliary contrastive learning task can enhance the adversarial robustness of MAML. Finally, extensive experiments are conducted to demonstrate the effectiveness of our proposed methods in robust few-shot learning.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Fast Training of Provably Robust Neural Networks by SingleProp

93 - Akhilan Boopathy , Tsui-Wei Weng , Sijia Liu 2021

Recent works have developed several methods of defending neural networks against adversarial attacks with certified guarantees. However, these techniques can be computationally costly due to the use of certification during training. We develop a new regularizer that is both more efficient than existing certified defenses, requiring only one additional forward propagation through a network, and can be used to train networks with similar certified accuracy. Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.

التعلم الآلي التعلم الالي

Self-Progressing Robust Training

55 - Minhao Cheng , Pin-Yu Chen , Sijia Liu 2020

Enhancing model robustness under new and even adversarial environments is a crucial milestone toward building trustworthy machine learning systems. Current robust training methods such as adversarial training explicitly uses an attack (e.g., $ell_{in fty}$-norm bounded perturbation) to generate adversarial examples during model training for improving adversarial robustness. In this paper, we take a different perspective and propose a new framework called SPROUT, self-progressing robust training. During model training, SPROUT progressively adjusts training label distribution via our proposed parametrized label smoothing technique, making training free of attack generation and more scalable. We also motivate SPROUT using a general formulation based on vicinity risk minimization, which includes many robust training methods as special cases. Compared with state-of-the-art adversarial training methods (PGD-l_inf and TRADES) under l_inf-norm bounded attacks and various invariance tests, SPROUT consistently attains superior performance and is more scalable to large neural networks. Our results shed new light on scalable, effective and attack-independent robust training methods.

التعلم الآلي الذكاء الاصطناعي

Learned Fine-Tuner for Incongruous Few-Shot Adversarial Learning

99 - Pu Zhao , Sijia Liu , Parikshit Ram 2020

Model-agnostic meta-learning (MAML) effectively meta-learns an initialization of model parameters for few-shot learning where all learning problems share the same format of model parameters -- congruous meta-learning. However, there are few-shot lear ning scenarios, such as adversarial attack design, where different yet related few-shot learning problems may not share any optimizee variables, necessitating incongruous meta-learning. We extend MAML to this setting -- a Learned Fine Tuner (LFT) is used to replace hand-designed optimizers (such as SGD) for the task-specific fine-tuning. Here, MAML instead meta-learns the parameters of this LFT across incongruous tasks leveraging the learning-to-optimize (L2O) framework such that models fine-tuned with LFT (even from random initializations) adapt quickly to new tasks. As novel contributions, we show that the use of LFT within MAML (i) offers the capability to tackle few-shot learning tasks by meta-learning across incongruous yet related problems and (ii) can efficiently work with first-order and derivative-free few-shot learning problems. Theoretically, we quantify the difference between LFT (for MAML) and L2O. Empirically, we demonstrate the effectiveness of LFT through a novel application of generating universal adversarial attacks across different image sources and sizes in the few-shot learning regime.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases

131 - Ren Wang , Gaoyuan Zhang , Sijia Liu 2020

When the training data are maliciously tampered, the predictions of the acquired deep neural network (DNN) can be manipulated by an adversary known as the Trojan attack (or poisoning backdoor attack). The lack of robustness of DNNs against Trojan att acks could significantly harm real-life machine learning (ML) systems in downstream applications, therefore posing widespread concern to their trustworthiness. In this paper, we study the problem of the Trojan network (TrojanNet) detection in the data-scarce regime, where only the weights of a trained DNN are accessed by the detector. We first propose a data-limited TrojanNet detector (TND), when only a few data samples are available for TrojanNet detection. We show that an effective data-limited TND can be established by exploring connections between Trojan attack and prediction-evasion adversarial attacks including per-sample attack as well as all-sample universal attack. In addition, we propose a data-free TND, which can detect a TrojanNet without accessing any data samples. We show that such a TND can be built by leveraging the internal response of hidden neurons, which exhibits the Trojan behavior even at random noise inputs. The effectiveness of our proposals is evaluated by extensive experiments under different model architectures and datasets including CIFAR-10, GTSRB, and ImageNet.

التعلم الآلي التشفير والأمن التعلم الالي

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد