Black-box optimization is important for many compute-intensive applications, including reinforcement learning (RL), robot control, etc. This paper presents a novel theoretical framework for black-box optimization, in which our method performs stochastic updates with the implicit natural gradient of an exponential-family distribution. Theoretically, we prove the convergence rate of our framework with full matrix update for convex functions. Our theoretical results also hold for continuous non-differentiable black-box functions. Our methods are very simple and involve fewer hyper-parameters than CMA-ES \cite{hansen2006cma}. Empirically, our method with full matrix update achieves competitive performance compared with CMA-ES, one of the state-of-the-art methods, on benchmark test problems. Moreover, our methods can achieve high optimization precision on some challenging test functions (e.g., the $l_1$-norm ellipsoid test problem and the Levy test problem), while methods with explicit natural gradient, i.e., IGO \cite{ollivier2017information} with full matrix update, cannot. This demonstrates the efficiency of our methods.
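The abstract does not spell out the update rule, so the sketch below shows a generic diagonal Gaussian natural-evolution-strategies-style update for black-box minimization, not the paper's implicit natural gradient method; the sphere objective, population size, and step sizes are illustrative assumptions.

```python
# Minimal sketch: natural-gradient-style search over a Gaussian
# (exponential-family) distribution using only black-box queries.
import numpy as np

def sphere(x):
    # Toy black-box objective; only its values are queried.
    return float(np.sum(x ** 2))

def nes_minimize(f, dim=5, iters=300, pop=20, lr_mu=1.0, lr_logsigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=dim)      # mean of the Gaussian search distribution
    log_sigma = np.zeros(dim)      # log of per-coordinate standard deviation
    for _ in range(iters):
        sigma = np.exp(log_sigma)
        eps = rng.normal(size=(pop, dim))                # standardized samples
        fs = np.array([f(mu + sigma * e) for e in eps])  # black-box queries
        # Rank-based utilities: the best (lowest-f) sample gets the largest weight.
        ranks = np.argsort(np.argsort(fs))               # 0 = best
        util = (pop - 1 - ranks).astype(float)
        util = util / util.sum() - 1.0 / pop             # zero-mean weights
        # Natural-gradient-style updates for the mean and log-std parameters.
        mu += lr_mu * sigma * (util @ eps)
        log_sigma += lr_logsigma * (util @ (eps ** 2 - 1.0)) / 2.0
    return mu

best = nes_minimize(sphere)
print(sphere(best))  # should approach 0 on this toy problem
```

The rank-based weights make the update invariant to monotone transformations of the objective, which matters for non-differentiable black-box functions like those mentioned above.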
Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as the learning rate. There exist many techniques for automated hyperparameter optimization, but they typically introduce …
Deep neural networks are vulnerable to adversarial examples, even in the black-box setting, where the attacker is restricted solely to query access. Existing black-box approaches to generating adversarial examples typically require a significant number …
In this work, we propose to employ information-geometric tools to optimize a graph neural network architecture such as graph convolutional networks. More specifically, we develop optimization algorithms for graph-based semi-supervised learning …
We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix. Such black-box learning settings are ubiquitous, for example, when the learner only has query access to the metric of interest, or in no…
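As a minimal illustration of the query-access setting described above (not the paper's algorithm), the sketch below evaluates predictions through a black-box metric of the confusion matrix; the G-mean metric and the toy labels are assumptions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, k=2):
    # k-by-k matrix: rows are true classes, columns are predicted classes.
    C = np.zeros((k, k), dtype=int)
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C

def black_box_metric(C):
    # Stand-in metric: geometric mean of per-class recalls (G-mean).
    # In the black-box setting the learner can only query such a function.
    recalls = np.diag(C) / np.maximum(C.sum(axis=1), 1)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

# The learner scores candidate classifiers purely via metric queries.
y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0, 1, 1, 1, 0])
print(black_box_metric(confusion_matrix(y_true, y_pred)))
```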
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation. Unfortunately, there is a tradeoff between cheap but simple variational families (e.g., fully factorized) and expensive and complicated inference …