ﻻ يوجد ملخص باللغة العربية
In this work, we study the problem of global optimization in univariate loss functions, where we analyze the regret of the popular lower bounding algorithms (e.g., Piyavskii-Shubert algorithm). For any given time $T$, instead of the widely available simple regret (which is the difference of the losses between the best estimation up to $T$ and the global optimizer), we study the cumulative regret up to that time. With a suitable lower bounding algorithm, we show that it is possible to achieve satisfactory cumulative regret bounds for different classes of functions. For Lipschitz continuous functions with the parameter $L$, we show that the cumulative regret is $O(Llog T)$. For Lipschitz smooth functions with the parameter $H$, we show that the cumulative regret is $O(H)$. We also analytically extend our results for a broader class of functions that covers both the Lipschitz continuous and smooth functions individually.
We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f_t(x) = g_t(langle x, thetarangle)$ for convex $g_t : mathbb R to mathbb R$ and unknown $theta in mathbb R^d$ that is homogeneou
In this paper, the problem of safe global maximization (it should not be confused with robust optimization) of expensive noisy black-box functions satisfying the Lipschitz condition is considered. The notion safe means that the objective function $f(
Reinforcement learning (RL) with linear function approximation has received increasing attention recently. However, existing work has focused on obtaining $sqrt{T}$-type regret bound, where $T$ is the number of interactions with the MDP. In this pape
We study the model-based undiscounted reinforcement learning for partially observable Markov decision processes (POMDPs). The oracle we consider is the optimal policy of the POMDP with a known environment in terms of the average reward over an infini
We introduce a new regularization technique, using what we refer to as the Steklov regularization function, and apply this technique to devise an algorithm that computes a global minimizer of univariate coercive functions. First, we show that the Ste