
Adaptive Importance Sampling for Finite-Sum Optimization and Sampling with Decreasing Step-Sizes

Added by: Ayoub El Hanchi
Publication date: 2021
Language: English





Reducing the variance of the gradient estimator is known to improve the convergence rate of stochastic gradient-based optimization and sampling algorithms. One way of achieving variance reduction is to design importance sampling strategies. Recently, the problem of designing such schemes was formulated as an online learning problem with bandit feedback, and algorithms with sub-linear static regret were designed. In this work, we build on this framework and propose Avare, a simple and efficient algorithm for adaptive importance sampling for finite-sum optimization and sampling with decreasing step-sizes. Under standard technical conditions, we show that Avare achieves $\mathcal{O}(T^{2/3})$ and $\mathcal{O}(T^{5/6})$ dynamic regret for SGD and SGLD respectively when run with $\mathcal{O}(1/t)$ step sizes. We achieve this dynamic regret bound by leveraging our knowledge of the dynamics defined by the algorithm, and combining ideas from online learning and variance-reduced stochastic optimization. We validate empirically the performance of our algorithm and identify settings in which it leads to significant improvements.
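To make the framework concrete, here is a minimal sketch of importance-sampled SGD on a finite sum: sampling probabilities track per-example gradient-norm estimates that are refreshed only for the drawn index (the bandit feedback of the online-learning formulation), mixed with a uniform floor for stability. The function names, the mixing floor, and the norm-tracking update are illustrative assumptions of this sketch, not Avare's exact rule.

```python
import numpy as np

def importance_sampled_sgd(grad_i, x0, n, T, seed=0):
    """Importance-sampled SGD on f(x) = (1/n) * sum_i f_i(x).

    grad_i(x, i) returns the gradient of f_i at x. Sampling probabilities
    follow running per-example gradient-norm estimates, updated only for
    the drawn index (bandit feedback). Illustrative, not Avare's rule.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    norms = np.ones(n)                     # running gradient-norm estimates
    p_min = 0.1 / n                        # uniform mixing floor (assumed)
    for t in range(1, T + 1):
        p = norms / norms.sum()
        p = (1.0 - n * p_min) * p + p_min  # keep every p_i bounded away from 0
        i = rng.choice(n, p=p)
        g = grad_i(x, i)
        x -= (1.0 / t) * g / (n * p[i])    # unbiased estimate, O(1/t) step size
        norms[i] = np.linalg.norm(g)       # bandit feedback: refresh drawn arm only
    return x
```

Reweighting the sampled gradient by $1/(n p_i)$ keeps the estimator unbiased for any sampling distribution with full support, which is what allows the distribution to adapt over time without biasing SGD.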



Related research


L. Martino, V. Elvira, D. Luengo (2015)
Monte Carlo methods represent the de facto standard for approximating complicated integrals involving multidimensional target distributions. In order to generate random realizations from the target distribution, Monte Carlo techniques use simpler proposal probability densities to draw candidate samples. The performance of any such method is strictly related to the specification of the proposal distribution, such that unfortunate choices easily wreak havoc on the resulting estimators. In this work, we introduce a layered (i.e., hierarchical) procedure to generate samples employed within a Monte Carlo scheme. This approach ensures that an appropriate equivalent proposal density is always obtained automatically (thus eliminating the risk of a catastrophic performance), although at the expense of a moderate increase in the complexity. Furthermore, we provide a general unified importance sampling (IS) framework, where multiple proposal densities are employed and several IS schemes are introduced by applying the so-called deterministic mixture approach. Finally, given these schemes, we also propose a novel class of adaptive importance samplers using a population of proposals, where the adaptation is driven by independent parallel or interacting Markov Chain Monte Carlo (MCMC) chains. The resulting algorithms efficiently combine the benefits of both IS and MCMC methods.
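As a concrete illustration of the deterministic mixture approach, the sketch below weights every draw against the average of all proposal densities rather than the single proposal that generated it, and returns a self-normalized estimate. This is an assumption-laden example, not the paper's algorithm; `proposals` is assumed to be a list of frozen SciPy distributions.

```python
import numpy as np
from scipy import stats

def dm_multiple_is(target_pdf, f, proposals, n_per, seed=0):
    """Deterministic-mixture multiple importance sampling: each draw is
    weighted against the average of ALL proposal densities."""
    rng = np.random.default_rng(seed)
    xs, ws = [], []
    for q in proposals:                    # proposals: frozen scipy.stats objects
        x = q.rvs(size=n_per, random_state=rng)
        mix = np.mean([qj.pdf(x) for qj in proposals], axis=0)
        ws.append(target_pdf(x) / mix)     # weight vs. the full mixture
        xs.append(x)
    x, w = np.concatenate(xs), np.concatenate(ws)
    return np.sum(w * f(x)) / np.sum(w)    # self-normalized estimate of E_target[f]

# Example: mean of an unnormalized N(0, 1) target with two crude proposals.
est = dm_multiple_is(
    target_pdf=lambda x: np.exp(-0.5 * x**2),   # unnormalized target is fine here
    f=lambda x: x,
    proposals=[stats.norm(-2, 2), stats.norm(2, 2)],
    n_per=5000,
)
```

Weighting against the full mixture is exactly what prevents the catastrophic weights that arise when a sample lands in a region where its own proposal is thin but another proposal is not.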
Many popular learning-rate schedules for deep neural networks combine a decaying trend with local perturbations that attempt to escape saddle points and bad local minima. We derive convergence guarantees for bandwidth-based step-sizes, a general class of learning-rates that are allowed to vary in a banded region. This framework includes cyclic and non-monotonic step-sizes for which no theoretical guarantees were previously known. We provide worst-case guarantees for SGD on smooth non-convex problems under several bandwidth-based step sizes, including stagewise $1/\sqrt{t}$ and the popular step-decay (constant and then drop by a constant), which is also shown to be optimal. Moreover, we show that its momentum variant (SGDM) converges as fast as SGD with the bandwidth-based step-decay step-size. Finally, we propose some novel step-size schemes in the bandwidth-based family and verify their efficiency on several deep neural network training tasks.
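For concreteness, the following sketch implements two schedules in this spirit: the step-decay rule described above (constant, then drop by a constant factor) and a hypothetical cyclic schedule confined to a band around a $1/\sqrt{t}$ envelope. The cycle length, drop factor, and band width are illustrative parameters, not values from the paper.

```python
import numpy as np

def step_decay(t, eta0=0.1, drop=0.5, every=30):
    """Step-decay: hold the learning rate constant, then drop by a factor."""
    return eta0 * drop ** (t // every)

def banded_cyclic(t, eta0=0.1, band=0.5, cycle=10):
    """A cyclic rate confined to a band around a decaying 1/sqrt(t) envelope."""
    envelope = eta0 / np.sqrt(t + 1.0)
    lo, hi = (1.0 - band) * envelope, (1.0 + band) * envelope
    phase = (t % cycle) / cycle            # position within the current cycle
    return lo + 0.5 * (hi - lo) * (1.0 + np.cos(2.0 * np.pi * phase))
```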
Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.
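The sketch below conveys the flavor of adaptive selection on a simple quadratic: coordinates are drawn with probability proportional to the current gradient (residual) magnitude, which is recomputed cheaply after each update. The paper's actual rules are built on dual residuals and primal-dual gap estimates, so treat this as an analogy rather than a reproduction.

```python
import numpy as np

def adaptive_coordinate_descent(A, b, T=1000, seed=0):
    """Coordinate descent on f(x) = 0.5 x'Ax - b'x (A symmetric positive
    definite) with residual-driven coordinate selection."""
    rng = np.random.default_rng(seed)
    x = np.zeros(len(b))
    grad = A @ x - b                       # maintained gradient/residual
    for _ in range(T):
        mass = np.abs(grad)
        if mass.sum() == 0.0:              # already optimal
            break
        i = rng.choice(len(b), p=mass / mass.sum())
        step = grad[i] / A[i, i]           # exact minimization along coordinate i
        x[i] -= step
        grad -= step * A[:, i]             # rank-one gradient update
    return x
```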
The Adaptive Multiple Importance Sampling (AMIS) algorithm is aimed at an optimal recycling of past simulations in an iterated importance sampling scheme. The difference with earlier adaptive importance sampling implementations like Population Monte Carlo is that the importance weights of all simulated values, past as well as present, are recomputed at each iteration, following the technique of the deterministic multiple mixture estimator of Owen and Zhou (2000). Although the convergence properties of the algorithm cannot be fully investigated, we demonstrate through a challenging banana shape target distribution and a population genetics example that the improvement brought by this technique is substantial.
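A one-dimensional sketch of the recycling step, assuming Gaussian proposals adapted by moment matching: after each adaptation, the weights of all samples, past and present, are recomputed against the deterministic mixture of every proposal used so far. The initial proposal parameters and the moment-matching adaptation are assumptions of this sketch.

```python
import numpy as np
from scipy import stats

def amis_sketch(target_pdf, n_iters=5, n_per=500, seed=0):
    """AMIS-style recycling in 1-D with Gaussian proposals: after each
    adaptation, the weights of ALL samples (past and present) are recomputed
    against the deterministic mixture of every proposal used so far."""
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 3.0                   # initial proposal (assumed)
    proposals, chunks = [], []
    for _ in range(n_iters):
        q = stats.norm(mu, sigma)
        proposals.append(q)
        chunks.append(q.rvs(size=n_per, random_state=rng))
        x = np.concatenate(chunks)
        mix = np.mean([p.pdf(x) for p in proposals], axis=0)
        w = target_pdf(x) / mix            # recomputed for past AND present draws
        w = w / w.sum()
        mu = np.sum(w * x)                 # moment-match the next proposal
        sigma = np.sqrt(max(np.sum(w * (x - mu) ** 2), 1e-12))
    return x, w
```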
Probabilistic software analysis aims at quantifying the probability of a target event occurring during the execution of a program processing uncertain incoming data or written itself using probabilistic programming constructs. Recent techniques combine symbolic execution with model counting or solution space quantification methods to obtain accurate estimates of the occurrence probability of rare target events, such as failures in a mission-critical system. However, they face several scalability and applicability limitations when analyzing software processing with high-dimensional and correlated multivariate input distributions. In this paper, we present SYMbolic Parallel Adaptive Importance Sampling (SYMPAIS), a new inference method tailored to analyze path conditions generated from the symbolic execution of programs with high-dimensional, correlated input distributions. SYMPAIS combines results from importance sampling and constraint solving to produce accurate estimates of the satisfaction probability for a broad class of constraints that cannot be analyzed by current solution space quantification methods. We demonstrate SYMPAIS's generality and performance compared with state-of-the-art alternatives on a set of problems from different application domains.
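The core quantity in this line of work is the satisfaction probability of a path condition under the input distribution. The sketch below estimates such a probability by plain importance sampling with a mean-shifted Gaussian proposal; the `constraint` callback and `shift` parameter are hypothetical stand-ins, correlation is omitted for brevity (a correlated input can be whitened first), and none of SYMPAIS's constraint-solving machinery is shown.

```python
import numpy as np

def satisfaction_prob(constraint, shift, n=100_000, seed=0):
    """Estimate P[constraint(X)] for X ~ N(0, I) by importance sampling with
    a mean-shifted Gaussian proposal N(shift, I). `constraint` and `shift`
    are hypothetical stand-ins for a path condition and a tuned proposal."""
    rng = np.random.default_rng(seed)
    shift = np.asarray(shift, dtype=float)
    x = rng.standard_normal((n, shift.size)) + shift
    log_w = -x @ shift + 0.5 * shift @ shift     # log N(0,I)/N(shift,I)
    sat = np.apply_along_axis(constraint, 1, x)  # indicator of the condition
    return float(np.mean(sat * np.exp(log_w)))
```

Shifting the proposal toward the satisfying region concentrates samples where the indicator is nonzero, which is what makes rare-event probabilities tractable to estimate.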
