Many applications in machine learning require optimizing a function whose true gradient is unknown, but where surrogate gradient information (directions that may be correlated with, but not necessarily identical to, the true gradient) is available instead. This arises when an approximate gradient is easier to compute than the full gradient (e.g., in meta-learning or unrolled optimization), or when a true gradient is intractable and is replaced with a surrogate (e.g., in certain reinforcement learning applications, or when using synthetic gradients). We propose Guided Evolutionary Strategies, a method for optimally using surrogate gradient directions along with random search. We define a search distribution for evolutionary strategies that is elongated along a guiding subspace spanned by the surrogate gradients. This allows us to estimate a descent direction which can then be passed to a first-order optimizer. We analytically and numerically characterize the tradeoffs that result from tuning how strongly the search distribution is stretched along the guiding subspace, and we use this to derive a setting of the hyperparameters that works well across problems. Finally, we apply our method to example problems, demonstrating an improvement over both standard evolutionary strategies and first-order methods (that directly follow the surrogate gradient). We provide a demo of Guided ES at https://github.com/brain-research/guided-evolutionary-strategies.
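To make the idea concrete, below is a minimal NumPy sketch of the core estimator described above: perturbations are drawn from a Gaussian whose covariance mixes the full parameter space with the subspace spanned by the surrogate gradients, and an antithetic-pair estimate of the descent direction is returned for use by a first-order optimizer. This is not the authors' implementation (see the linked repository for that); the function and parameter names (guided_es_grad, sigma, alpha, beta, num_pairs) and the default values are illustrative assumptions.

```python
import numpy as np

def guided_es_grad(f, x, surrogate_grads, sigma=0.1, alpha=0.5, beta=2.0, num_pairs=8):
    """One Guided ES descent-direction estimate (sketch).

    f               : objective, maps R^n -> R
    x               : current parameters, shape (n,)
    surrogate_grads : (n, k) array whose columns span the guiding subspace
    alpha           : trades off the full space (alpha -> 1) vs. the subspace (alpha -> 0)
    """
    n, k = surrogate_grads.shape
    # Orthonormal basis U of the guiding subspace.
    U, _ = np.linalg.qr(surrogate_grads)

    grad_est = np.zeros(n)
    for _ in range(num_pairs):
        # Sample eps ~ N(0, sigma^2 * (alpha/n * I + (1 - alpha)/k * U U^T)):
        # the search distribution is elongated along the guiding subspace.
        eps = sigma * (np.sqrt(alpha / n) * np.random.randn(n)
                       + np.sqrt((1 - alpha) / k) * U @ np.random.randn(k))
        # Antithetic pair of function evaluations.
        grad_est += eps * (f(x + eps) - f(x - eps))
    return beta / (2 * sigma**2 * num_pairs) * grad_est

# Toy usage: quadratic loss with a deliberately corrupted surrogate gradient.
A = np.random.randn(50, 20)
f = lambda x: 0.5 * np.sum((A @ x) ** 2)
x = np.random.randn(20)
for _ in range(200):
    surrogate = (A.T @ (A @ x) + 0.5 * np.random.randn(20)).reshape(-1, 1)
    x -= 0.05 * guided_es_grad(f, x, surrogate)  # pass the estimate to plain SGD
```

The returned vector can equally be fed to any other first-order optimizer (momentum, Adam, etc.) in place of a true gradient.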