Maximum Likelihood-based Online Adaptation of Hyper-parameters in CMA-ES

242 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Loshchilov Ilya

تاريخ النشر 2014

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ilya Loshchilov

الحوسبة العصبية والتطورية الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is widely accepted as a robust derivative-free continuous optimization algorithm for non-linear and non-convex optimization problems. CMA-ES is well known to be almost parameterless, meaning that only one hyper-parameter, the population size, is proposed to be tuned by the user. In this paper, we propose a principled approach called self-CMA-ES to achieve the online adaptation of CMA-ES hyper-parameters in order to improve its overall performance. Experimental results show that for larger-than-default population size, the default settings of hyper-parameters of CMA-ES are far from being optimal, and that self-CMA-ES allows for dynamically approaching optimal settings.

قيم البحث

289 - Ilya Loshchilov 2014

We propose a computationally efficient limited memory Covariance Matrix Adaptation Evolution Strategy for large scale optimization, which we call the LM-CMA-ES. The LM-CMA-ES is a stochastic, derivative-free algorithm for numerical optimization of no n-linear, non-convex optimization problems in continuous domain. Inspired by the limited memory BFGS method of Liu and Nocedal (1989), the LM-CMA-ES samples candidate solutions according to a covariance matrix reproduced from $m$ direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors allows to reduce the time and memory complexity of the sampling to $O(mn)$, where $n$ is the number of decision variables. When $n$ is large (e.g., $n$ > 1000), even relatively small values of $m$ (e.g., $m=20,30$) are sufficient to efficiently solve fully non-separable problems and to reduce the overall run-time.

الحوسبة العصبية والتطورية

Genetic Algorithm based hyper-parameters optimization for transfer Convolutional Neural Network

373 - Chen Li , JinZhe Jiang , YaQian Zhao 2021

Hyperparameter optimization is a challenging problem in developing deep neural networks. Decision of transfer layers and trainable layers is a major task for design of the transfer convolutional neural networks (CNN). Conventional transfer CNN models are usually manually designed based on intuition. In this paper, a genetic algorithm is applied to select trainable layers of the transfer model. The filter criterion is constructed by accuracy and the counts of the trainable layers. The results show that the method is competent in this task. The system will converge with a precision of 97% in the classification of Cats and Dogs datasets, in no more than 15 generations. Moreover, backward inference according the results of the genetic algorithm shows that our method can capture the gradient features in network layers, which plays a part on understanding of the transfer AI models.

الحوسبة العصبية والتطورية التعلم الآلي

Alternative Restart Strategies for CMA-ES

314 - Ilya Loshchilov 2012

This paper focuses on the restart strategy of CMA-ES on multi-modal functions. A first alternative strategy proceeds by decreasing the initial step-size of the mutation while doubling the population size at each restart. A second strategy adaptively allocates the computational budget among the restart settings in the BIPOP scheme. Both restart strategies are validated on the BBOB benchmark; their generality is also demonstrated on an independent real-world problem suite related to spacecraft trajectory optimization.

الذكاء الاصطناعي

Maximum likelihood estimation for disk image parameters

121 - Matwey V. Kornilov 2019

We present a novel technique for estimating disk parameters (the centre and the radius) from its 2D image. It is based on the maximal likelihood approach utilising both edge pixels coordinates and the image intensity gradients. We emphasise the follo wing advantages of our likelihood model. It has closed-form formulae for parameter estimating, requiring less computational resources than iterative algorithms therefore. The likelihood model naturally distinguishes the outer and inner annulus edges. The proposed technique was evaluated on both synthetic and real data.

معالجة الصور والفيديو الأجهزة والأساليب للزيئات الفيزياء الفلكية الرؤية الحاسوبية وتمييز الأنماط

Maximum Likelihood Estimation for Learning Populations of Parameters

109 - Ramya Korlakai Vinayak , Weihao Kong , Gregory Valiant 2019

Consider a setting with $N$ independent individuals, each with an unknown parameter, $p_i in [0, 1]$ drawn from some unknown distribution $P^star$. After observing the outcomes of $t$ independent Bernoulli trials, i.e., $X_i sim text{Binomial}(t, p_i )$ per individual, our objective is to accurately estimate $P^star$. This problem arises in numerous domains, including the social sciences, psychology, health-care, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where $t ll N$, the maximum likelihood estimator (MLE) is both statistically minimax optimal and efficiently computable. Precisely, for sufficiently large $N$, the MLE achieves the information theoretic optimal error bound of $mathcal{O}(frac{1}{t})$ for $t < clog{N}$, with regards to the earth movers distance (between the estimated and true distributions). More generally, in an exponentially large interval of $t$ beyond $c log{N}$, the MLE achieves the minimax error bound of $mathcal{O}(frac{1}{sqrt{tlog N}})$. In contrast, regardless of how large $N$ is, the naive plug-in estimator for this problem only achieves the sub-optimal error of $Theta(frac{1}{sqrt{t}})$.

نظرية الإحصاء التعلم الآلي التعلم الالي