A survey of Monte Carlo methods for noisy and costly densities with application to reinforcement learning

Published by Fernando Llorente Fernández

Publication date: 2021

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: F. Llorente, L. Martino, J. Read

This survey gives an overview of Monte Carlo methodologies using surrogate models, for dealing with densities which are intractable, costly, and/or noisy. This type of problem can be found in numerous real-world scenarios, including stochastic optimization and reinforcement learning, where each evaluation of a density function may incur some computationally-expensive or even physical (real-world activity) cost, likely to give different results each time. The surrogate model does not incur this cost, but there are important trade-offs and considerations involved in the choice and design of such methodologies. We classify the different methodologies into three main classes and describe specific instances of algorithms under a unified notation. A modular scheme which encompasses the considered methods is also presented. A range of application scenarios is discussed, with special attention to the likelihood-free setting and reinforcement learning. Several numerical comparisons are also provided.
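As a rough illustration of the surrogate idea described in the abstract, the following minimal sketch assumes a toy target (a standard normal log-density observed with additive Gaussian noise) and a deliberately simple surrogate (averaged evaluations at fixed nodes, linearly interpolated). The names `noisy_log_density`, `build_surrogate`, and `metropolis` are hypothetical and are not taken from the survey; the survey's own algorithms are more elaborate.

```python
import math
import random


def noisy_log_density(x, rng, noise_sd=0.3):
    # Hypothetical costly target: standard normal log-density plus evaluation noise.
    return -0.5 * x * x + rng.gauss(0.0, noise_sd)


def build_surrogate(nodes, rng, repeats=20):
    # Average repeated noisy evaluations at each node to reduce the noise.
    values = [
        sum(noisy_log_density(x, rng) for _ in range(repeats)) / repeats
        for x in nodes
    ]

    def surrogate(x):
        # Piecewise-linear interpolation of the averaged log-density values.
        if x < nodes[0] or x > nodes[-1]:
            return -math.inf  # no support outside the node range (assumption)
        for i in range(len(nodes) - 1):
            if x <= nodes[i + 1]:
                t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
                return (1.0 - t) * values[i] + t * values[i + 1]
        return -math.inf

    return surrogate


def metropolis(log_target, n_steps, rng, x0=0.0, step=1.0):
    # Random-walk Metropolis on a deterministic (here: surrogate) log-density.
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lq = log_target(y)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lq - lp:
            x, lp = y, lq
        samples.append(x)
    return samples


rng = random.Random(0)
nodes = [-5.0 + 0.5 * i for i in range(21)]
surrogate = build_surrogate(nodes, rng)      # 21 * 20 = 420 noisy evaluations
samples = metropolis(surrogate, 20000, rng)  # no further noisy evaluations
```

In this sketch the noisy target is queried only while building the surrogate, so the sampling cost is decoupled from the evaluation cost. The price is that the chain targets the surrogate rather than the true density, which is one of the trade-offs the abstract alludes to.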

Florian Maire, Randal Douc, Jimmy Olsson (2013)

In this paper, we study the asymptotic variance of sample path averages for inhomogeneous Markov chains that evolve alternatingly according to two different $\pi$-reversible Markov transition kernels $P$ and $Q$. More specifically, our main result allows us to compare directly the asymptotic variances of two inhomogeneous Markov chains associated with different kernels $P_i$ and $Q_i$, $i \in \{0,1\}$, as soon as the kernels of each pair $(P_0,P_1)$ and $(Q_0,Q_1)$ can be ordered in the sense of lag-one autocovariance. As an important application, we use this result for comparing different data-augmentation-type Metropolis-Hastings algorithms. In particular, we compare some pseudo-marginal algorithms and propose a novel exact algorithm, referred to as the random refreshment algorithm, which is more efficient, in terms of asymptotic variance, than the Grouped Independence Metropolis-Hastings algorithm and has a computational complexity that does not exceed that of the Monte Carlo Within Metropolis algorithm.
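To make the two baseline algorithms named in the abstract concrete, here is a minimal sketch under stated assumptions: the target is a toy unnormalized standard normal, and the unbiased estimator multiplies it by averaged log-normal weights with mean one. The function names and the `refresh_current` flag are hypothetical. The sketch covers only the Grouped Independence Metropolis-Hastings (GIMH) and Monte Carlo Within Metropolis (MCWM) variants; it does not implement the paper's random refreshment algorithm.

```python
import math
import random


def unbiased_estimate(x, rng, n=5, sigma=0.5):
    # Toy unbiased estimate of exp(-x^2 / 2): the exact value times the
    # average of n log-normal weights, each with expectation one.
    density = math.exp(-0.5 * x * x)
    weights = [
        math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2)
        for _ in range(n)
    ]
    return density * sum(weights) / n


def pseudo_marginal_mh(n_steps, rng, refresh_current=False, x0=0.0, step=1.5):
    x = x0
    z = unbiased_estimate(x, rng)  # estimate attached to the current state
    samples = []
    for _ in range(n_steps):
        if refresh_current:
            # MCWM-style: re-estimate the current state at every iteration.
            z = unbiased_estimate(x, rng)
        y = x + rng.gauss(0.0, step)       # symmetric random-walk proposal
        zy = unbiased_estimate(y, rng)
        if rng.random() * z < zy:          # accept with probability min(1, zy / z)
            x, z = y, zy                   # GIMH-style: the estimate is recycled
        samples.append(x)
    return samples


gimh_samples = pseudo_marginal_mh(30000, random.Random(1))
mcwm_samples = pseudo_marginal_mh(30000, random.Random(1), refresh_current=True)
```

With `refresh_current=False` the estimate of the current state is reused until a move is accepted, which is what keeps the chain exact for the target. Refreshing it every iteration costs one extra estimate per step and in general leaves the target only approximately invariant.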

Methodology, Computation

Introducing Monte Carlo Methods with R Solutions to Odd-Numbered Exercises

Christian P. Robert, George Casella (2010)

This is the solution manual to the odd-numbered exercises in our book Introducing Monte Carlo Methods with R, published by Springer Verlag on December 10, 2009, and made freely available to everyone.

Methodology, Computation

Active Reinforcement Learning with Monte-Carlo Tree Search

Sebastian Schulze, Owain Evans (2018)

Active Reinforcement Learning (ARL) is a twist on RL where the agent observes reward information only if it pays a cost. This subtle change makes exploration substantially more challenging. Powerful principles in RL like optimism, Thompson sampling, and random exploration do not help with ARL. We relate ARL in tabular environments to Bayes-Adaptive MDPs. We provide an ARL algorithm using Monte-Carlo Tree Search that is asymptotically Bayes optimal. Experimentally, this algorithm is near-optimal on small Bandit problems and MDPs. On larger MDPs it outperforms a Q-learner augmented with specialised heuristics for ARL. By analysing exploration behaviour in detail, we uncover obstacles to scaling up simulation-based algorithms for ARL.

Machine Learning

Unit Testing for MCMC and other Monte Carlo Methods

Axel Gandy, James Scott (2020)

We propose approaches for testing implementations of Markov Chain Monte Carlo methods as well as of general Monte Carlo methods. Based on statistical hypothesis tests, these approaches can be used in a unit testing framework to, for example, check if individual steps in a Gibbs sampler or a reversible jump MCMC have the desired invariant distribution. Two exact tests for assessing whether a given Markov chain has a specified invariant distribution are discussed. These and other tests of Monte Carlo methods can be embedded into a sequential method that allows low expected effort if the simulation shows the desired behavior and high power if it does not. Moreover, the false rejection probability can be kept arbitrarily low. For general Monte Carlo methods, this allows testing, for example, if a sampler has a specified distribution or if a sampler produces samples with the desired mean. The methods have been implemented in the R-package MCUnit.
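The idea of checking that "a sampler produces samples with the desired mean" can be sketched with a plain fixed-sample z-test. This is an illustrative simplification, not the sequential procedure from the paper and not the MCUnit API; the function name `mean_z_test` and the threshold `z_crit` are assumptions, and the draws are assumed to be independent.

```python
import math
import random


def mean_z_test(sampler, expected_mean, n=10000, z_crit=4.0):
    # Draw n independent samples and compare their mean with the expected one.
    draws = [sampler() for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    z = (mean - expected_mean) / math.sqrt(var / n)
    # A wide threshold keeps the false rejection probability small
    # (under a normal approximation), at the price of lower power.
    return abs(z) < z_crit, z


rng = random.Random(42)

# A correct sampler: normal with mean 1 and standard deviation 2.
passed, z_ok = mean_z_test(lambda: rng.gauss(1.0, 2.0), expected_mean=1.0)

# A "buggy" sampler whose mean is shifted to 1.2.
passed_buggy, z_bad = mean_z_test(lambda: rng.gauss(1.2, 2.0), expected_mean=1.0)
```

For correlated MCMC output the naive standard error used here is too small, so the chain would need thinning or an autocorrelation-aware variance estimate before such a check is meaningful.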

Methodology, Computation

Multimodal Bayesian Registration of Noisy Functions using Hamiltonian Monte Carlo

J. Derek Tucker, Lyndsay Shand (2020)

Functional data registration is a necessary processing step for many applications. The observed data can be inherently noisy, often due to measurement error or natural process uncertainty, which most functional alignment methods cannot handle. A pair of functions can also have multiple optimal alignment solutions, which is not addressed in the current literature. In this paper, a flexible Bayesian approach to functional alignment is presented, which appropriately accounts for noise in the data without any pre-smoothing required. Additionally, by running parallel MCMC chains, the method can account for multiple optimal alignments via the multi-modal posterior distribution of the warping functions. To sample the warping functions most efficiently, the approach relies on a modification of the standard Hamiltonian Monte Carlo so that it is well-defined on the infinite-dimensional Hilbert space. This flexible Bayesian alignment method is applied to both simulated data and real data sets to show its efficiency in handling noisy functions and successfully accounting for multiple optimal alignments in the posterior, thereby characterizing the uncertainty surrounding the warping functions.

Methodology, Computation