
Faster Rates for Policy Learning

 Added by Antoine Chambaz
 Publication date 2017
 Research language: English





This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme.
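To make the notions of plug-in estimation and regret concrete, here is a minimal sketch for the simplest setting mentioned above: two actions, mean-value regret, and i.i.d. sampling. Everything in it is an illustrative assumption rather than the article's construction: the data-generating model (context uniform on [-1, 1], action randomized with probability 1/2, mean outcome x under action 1 and 0 under action 0), the linear outcome regressions, and the function names `simulate`, `fit_line`, `plug_in_policy`, and `regret`.

```python
import random

def simulate(n, rng):
    """Draw n i.i.d. observations (x, a, y) from a hypothetical two-action model."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)        # context
        a = rng.randrange(2)              # action randomized with probability 1/2
        y = a * x + rng.gauss(0.0, 1.0)   # mean outcome: x under a=1, 0 under a=0
        data.append((x, a, y))
    return data

def fit_line(points):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def plug_in_policy(data):
    """Estimate each arm's outcome regression and act greedily on the estimates."""
    q0 = fit_line([(x, y) for x, a, y in data if a == 0])
    q1 = fit_line([(x, y) for x, a, y in data if a == 1])

    def policy(x):
        return 1 if q1[0] + q1[1] * x > q0[0] + q0[1] * x else 0

    return policy

def regret(policy, grid_size=20001):
    """Mean-value regret against the optimal rule 1{x > 0} in the assumed model,
    computed by midpoint integration over x in [-1, 1]."""
    total = 0.0
    for i in range(grid_size):
        x = -1.0 + 2.0 * (i + 0.5) / grid_size
        optimal = 1 if x > 0 else 0
        # Value lost at x: (optimal action - chosen action) times the effect x.
        total += (optimal - policy(x)) * x
    return total / grid_size

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        print(n, regret(plug_in_policy(simulate(n, rng))))
```

In this toy model, a rule that treats when x > t has regret t^2/4 for |t| <= 1, so the regret is quadratic in the error of the estimated decision boundary. That is one simple way to see why regret can shrink faster than the estimation error itself when the context distribution puts little mass near the boundary, which is the kind of situation a margin condition describes.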


Read More

In functional linear regression, the slope "parameter" is a function. Therefore, in a nonparametric context, it is determined by an infinite number of unknowns. Its estimation involves solving an ill-posed problem and has points of contact with a range of methodologies, including statistical smoothing and deconvolution. The standard approach to estimating the slope function is based explicitly on functional principal components analysis and, consequently, on spectral decomposition in terms of eigenvalues and eigenfunctions. We discuss this approach in detail and show that in certain circumstances, optimal convergence rates are achieved by the PCA technique. An alternative approach based on quadratic regularisation is suggested and shown to have advantages from some points of view.
We show that diffusion processes can be exploited to study the posterior contraction rates of parameters in Bayesian models. By treating the posterior distribution as a stationary distribution of a stochastic differential equation (SDE), posterior convergence rates can be established via control of the moments of the corresponding SDE. Our results depend on the structure of the population log-likelihood function, obtained in the limit of an infinite sample size, and stochastic perturbation bounds between the population and sample log-likelihood functions. When the population log-likelihood is strongly concave, we establish posterior convergence of a $d$-dimensional parameter at the optimal rate $(d/n)^{1/2}$. In the weakly concave setting, we show that the convergence rate is determined by the unique solution of a non-linear equation that arises from the interplay between the degree of weak concavity and the stochastic perturbation bounds. We illustrate this general theory by deriving posterior convergence rates for three concrete examples: Bayesian logistic regression models, Bayesian single index models, and over-specified Bayesian mixture models.
We study minimax estimation of two-dimensional totally positive distributions. Such distributions pertain to pairs of strongly positively dependent random variables and appear frequently in statistics and probability. In particular, for distributions with $\beta$-Hölder smooth densities where $\beta \in (0, 2)$, we observe polynomially faster minimax rates of estimation when, additionally, the total positivity condition is imposed. Moreover, we demonstrate fast algorithms to compute the proposed estimators and corroborate the theoretical rates of estimation by simulation studies.
This paper introduces a new approach to the study of rates of convergence for posterior distributions. It is a natural extension of a recent approach to the study of Bayesian consistency. In particular, we improve on current rates of convergence for models including the mixture of Dirichlet process model and the random Bernstein polynomial model.
This paper deals with the estimation of a probability measure on the real line from data observed with an additive noise. We are interested in rates of convergence for the Wasserstein metric of order $p \geq 1$. The distribution of the errors is assumed to be known and to belong to a class of supersmooth or ordinary smooth distributions. We obtain in the univariate situation an improved upper bound in the ordinary smooth case and less restrictive conditions for the existing bound in the supersmooth one. In the ordinary smooth case, a lower bound is also provided, and numerical experiments illustrating the rates of convergence are presented.