Adaptive Bound Optimization for Online Convex Optimization

556 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hugh Brendan McMahan

تاريخ النشر 2010

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف H. Brendan McMahan - Matthew Streeter

التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We introduce a new online convex optimization algorithm that adaptively chooses its regularization function based on the loss functions observed so far. This is in contrast to previous algorithms that use a fixed regularization function such as L2-squared, and modify it only via a single time-dependent parameter. Our algorithms regret bounds are worst-case optimal, and for certain realistic classes of loss functions they are much better than existing bounds. These bounds are problem-dependent, which means they can exploit the structure of the actual problem instance. Critically, however, our algorithm does not need to know this structure in advance. Rather, we prove competitive guarantees that show the algorithm provides a bound within a constant factor of the best possible bound (of a certain functional form) in hindsight.

قيم البحث

اقرأ أيضاً

Boosting for Online Convex Optimization

123 - Elad Hazan , Karan Singh 2021

We consider the decision-making framework of online convex optimization with a very large number of experts. This setting is ubiquitous in contextual and reinforcement learning problems, where the size of the policy class renders enumeration and sear ch within the policy class infeasible. Instead, we consider generalizing the methodology of online boosting. We define a weak learning algorithm as a mechanism that guarantees multiplicatively approximate regret against a base class of experts. In this access model, we give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class. We consider both full and partial (a.k.a. bandit) information feedback models. We also give an analogous efficient boosting algorithm for the i.i.d. statistical setting. Our results simultaneously generalize online boosting and gradient boosting guarantees to contextual learning model, online convex optimization and bandit linear optimization settings.

التعلم الآلي التعلم الالي

No-Regret Algorithms for Unconstrained Online Convex Optimization

336 - Matthew Streeter , H. Brendan McMahan 2012

Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is R^n. Existing algorithms fail to achieve sub-linear regret in this setting unless c onstraints on the comparator point x^* are known in advance. We present algorithms that, without such prior knowledge, offer near-optimal regret bounds with respect to any choice of x^*. In particular, regret with respect to x^* = 0 is constant. We then prove lower bounds showing that our guarantees are near-optimal in this setting.

التعلم الآلي

Adversarial Delays in Online Strongly-Convex Optimization

188 - Daniel Khashabi , Kent Quanrud , Amirhossein Taghvaei 2016

We consider the problem of strongly-convex online optimization in presence of adversarial delays; in a T-iteration online game, the feedback of the players query at time t is arbitrarily delayed by an adversary for d_t rounds and delivered before the game ends, at iteration t+d_t-1. Specifically for algo{online-gradient-descent} algorithm we show it has a simple regret bound of Oh{sum_{t=1}^T log (1+ frac{d_t}{t})}. This gives a clear and simple bound without resorting any distributional and limiting assumptions on the delays. We further show how this result encompasses and generalizes several of the existing known results in the literature. Specifically it matches the celebrated logarithmic regret Oh{log T} when there are no delays (i.e. d_t = 1) and regret bound of Oh{tau log T} for constant delays d_t = tau.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

A Simple yet Universal Strategy for Online Convex Optimization

129 - Lijun Zhang , Guanghui Wang , Jinfeng Yi 2021

Recently, several universal methods have been proposed for online convex optimization, and attain minimax rates for multiple types of convex functions simultaneously. However, they need to design and optimize one surrogate loss for each type of funct ions, which makes it difficult to exploit the structure of the problem and utilize the vast amount of existing algorithms. In this paper, we propose a simple strategy for universal online convex optimization, which avoids these limitations. The key idea is to construct a set of experts to process the original online functions, and deploy a meta-algorithm over the emph{linearized} losses to aggregate predictions from experts. Specifically, we choose Adapt-ML-Prod to track the best expert, because it has a second-order bound and can be used to leverage strong convexity and exponential concavity. In this way, we can plug in off-the-shelf online solvers as black-box experts to deliver problem-dependent regret bounds. Furthermore, our strategy inherits the theoretical guarantee of any expert designed for strongly convex functions and exponentially concave functions, up to a double logarithmic factor. For general convex functions, it maintains the minimax optimality and also achieves a small-loss bound.

التعلم الآلي التحسين والتحكم

Hierarchical Online Convex Optimization

110 - Juncheng Wang , Ben Liang , Min Dong 2021

We consider online convex optimization (OCO) over a heterogeneous network with communication delay, where multiple workers together with a master execute a sequence of decisions to minimize the accumulation of time-varying global costs. The local dat a may not be independent or identically distributed, and the global cost functions may not be locally separable. Due to communication delay, neither the master nor the workers have in-time information about the current global cost function. We propose a new algorithm, termed Hierarchical OCO (HiOCO), which takes full advantage of the network heterogeneity in information timeliness and computation capacity to enable multi-step gradient descent at both the workers and the master. We analyze the impacts of the unique hierarchical architecture, multi-slot delay, and gradient estimation error to derive upper bounds on the dynamic regret of HiOCO, which measures the gap of costs between HiOCO and an offline globally optimal performance benchmark.

نظرية المعلومات نظرية المعلومات