Maximum Likelihood-based Online Adaptation of Hyper-parameters in CMA-ES

 Added by Loshchilov Ilya
 Publication date 2014
Research language: English




The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is widely accepted as a robust derivative-free continuous optimization algorithm for non-linear and non-convex optimization problems. CMA-ES is well known to be almost parameterless, meaning that only one hyper-parameter, the population size, is meant to be tuned by the user. In this paper, we propose a principled approach called self-CMA-ES to achieve the online adaptation of CMA-ES hyper-parameters in order to improve its overall performance. Experimental results show that for larger-than-default population sizes, the default hyper-parameter settings of CMA-ES are far from optimal, and that self-CMA-ES allows for dynamically approaching optimal settings.




91 - Ilya Loshchilov 2014
We propose a computationally efficient limited memory Covariance Matrix Adaptation Evolution Strategy for large scale optimization, which we call the LM-CMA-ES. The LM-CMA-ES is a stochastic, derivative-free algorithm for numerical optimization of non-linear, non-convex optimization problems in continuous domains. Inspired by the limited memory BFGS method of Liu and Nocedal (1989), the LM-CMA-ES samples candidate solutions according to a covariance matrix reproduced from $m$ direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors reduces the time and memory complexity of the sampling to $O(mn)$, where $n$ is the number of decision variables. When $n$ is large (e.g., $n > 1000$), even relatively small values of $m$ (e.g., $m = 20, 30$) are sufficient to efficiently solve fully non-separable problems and to reduce the overall run-time.
Hyperparameter optimization is a challenging problem in developing deep neural networks. Deciding which layers to transfer and which to train is a major task in the design of transfer convolutional neural networks (CNNs). Conventional transfer CNN models are usually designed manually, based on intuition. In this paper, a genetic algorithm is applied to select the trainable layers of the transfer model. The filter criterion is constructed from the accuracy and the number of trainable layers. The results show that the method is competent in this task. The system converges with a precision of 97% on the classification of the Cats and Dogs datasets in no more than 15 generations. Moreover, backward inference according to the results of the genetic algorithm shows that our method can capture the gradient features in network layers, which plays a part in understanding transfer AI models.
167 - Ilya Loshchilov 2012
This paper focuses on the restart strategy of CMA-ES on multi-modal functions. A first alternative strategy proceeds by decreasing the initial step-size of the mutation while doubling the population size at each restart. A second strategy adaptively allocates the computational budget among the restart settings in the BIPOP scheme. Both restart strategies are validated on the BBOB benchmark; their generality is also demonstrated on an independent real-world problem suite related to spacecraft trajectory optimization.
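The first restart strategy described above (shrink the initial step-size, double the population size at each restart) can be illustrated with a small schedule generator. This is only a sketch under stated assumptions: the function names, the starting step-size `sigma0`, and the shrink factor `sigma_decay` are hypothetical choices for illustration, not values taken from the paper. The starting population size uses the commonly cited CMA-ES default of 4 + floor(3 ln n).

```python
import math


def default_population_size(n):
    """Commonly cited CMA-ES default population size for dimension n."""
    return 4 + math.floor(3 * math.log(n))


def restart_schedule(n, restarts, sigma0=1.0, sigma_decay=0.5):
    """Return a list of (restart_index, population_size, initial_step_size).

    sigma0 and sigma_decay are hypothetical illustration values:
    each restart doubles the population and multiplies the initial
    step-size by sigma_decay (< 1, so the step-size shrinks).
    """
    popsize = default_population_size(n)
    sigma = sigma0
    schedule = []
    for r in range(restarts):
        schedule.append((r, popsize, sigma))
        popsize *= 2          # double the population size
        sigma *= sigma_decay  # decrease the initial step-size
    return schedule


if __name__ == "__main__":
    for r, popsize, sigma in restart_schedule(n=10, restarts=4):
        print(f"restart {r}: population size {popsize}, initial step-size {sigma}")
```

Each tuple would seed one independent CMA-ES run; the second, BIPOP-style strategy additionally decides how much of the evaluation budget each setting receives, which this sketch does not model.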
121 - Matwey V. Kornilov 2019
We present a novel technique for estimating the parameters of a disk (its centre and radius) from its 2D image. It is based on the maximum likelihood approach and utilises both the edge pixel coordinates and the image intensity gradients. We emphasise the following advantages of our likelihood model. It has closed-form formulae for parameter estimation and therefore requires fewer computational resources than iterative algorithms. The likelihood model naturally distinguishes the outer and inner annulus edges. The proposed technique was evaluated on both synthetic and real data.
Consider a setting with $N$ independent individuals, each with an unknown parameter $p_i \in [0, 1]$ drawn from some unknown distribution $P^\star$. After observing the outcomes of $t$ independent Bernoulli trials, i.e., $X_i \sim \text{Binomial}(t, p_i)$ per individual, our objective is to accurately estimate $P^\star$. This problem arises in numerous domains, including the social sciences, psychology, health-care, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where $t \ll N$, the maximum likelihood estimator (MLE) is both statistically minimax optimal and efficiently computable. Precisely, for sufficiently large $N$, the MLE achieves the information theoretic optimal error bound of $\mathcal{O}(\frac{1}{t})$ for $t < c\log{N}$, with respect to the earth mover's distance (between the estimated and true distributions). More generally, in an exponentially large interval of $t$ beyond $c \log{N}$, the MLE achieves the minimax error bound of $\mathcal{O}(\frac{1}{\sqrt{t\log N}})$. In contrast, regardless of how large $N$ is, the naive plug-in estimator for this problem only achieves the sub-optimal error of $\Theta(\frac{1}{\sqrt{t}})$.
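To make the two estimators in the abstract above concrete, here is a minimal sketch, not the authors' implementation. The plug-in estimator is simply the empirical distribution of the ratios X_i / t. For the MLE, the sketch swaps in a common approximation: it restricts the candidate distribution to a fixed grid on [0, 1] and fits the grid weights with EM. The function names, the grid size, and the iteration count are all assumptions made for illustration.

```python
import math


def binomial_likelihood(x, t, q):
    """P(X = x) for X ~ Binomial(t, q)."""
    return math.comb(t, x) * (q ** x) * ((1.0 - q) ** (t - x))


def plug_in(counts, t):
    """Naive plug-in estimate: the sorted empirical ratios x_i / t."""
    return sorted(x / t for x in counts)


def grid_mle(counts, t, grid_size=11, iterations=200):
    """Approximate MLE of the mixing distribution via EM on a fixed grid.

    Assumption: the distribution is supported on grid_size equally
    spaced points in [0, 1]. Returns (grid, weights).
    """
    grid = [j / (grid_size - 1) for j in range(grid_size)]
    weights = [1.0 / grid_size] * grid_size
    # Likelihood of each possible count value 0..t at each grid point.
    table = [[binomial_likelihood(x, t, q) for q in grid] for x in range(t + 1)]
    # Only the histogram of counts matters for the likelihood.
    histogram = [0] * (t + 1)
    for x in counts:
        histogram[x] += 1
    n = len(counts)
    for _ in range(iterations):
        new_weights = [0.0] * grid_size
        for x, freq in enumerate(histogram):
            if freq == 0:
                continue
            # E-step: posterior mass of each grid point given count x.
            joint = [w * like for w, like in zip(weights, table[x])]
            total = sum(joint)
            for j in range(grid_size):
                new_weights[j] += freq * joint[j] / total
        # M-step: new weights are the averaged posterior masses.
        weights = [w / n for w in new_weights]
    return grid, weights


if __name__ == "__main__":
    counts = [0, 1, 1, 2, 4, 5, 5, 3]  # made-up illustration data, t = 5
    grid, weights = grid_mle(counts, t=5)
    print(plug_in(counts, t=5))
    print([round(w, 3) for w in weights])
```

The grid restriction is a design shortcut: it turns the problem into fitting a finite mixture with known components, so each EM step is a simple reweighting. The error rates quoted in the abstract refer to the exact MLE and are not claimed for this approximation.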
