Quantum Algorithm for Online Convex Optimization

102 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jianhao He

تاريخ النشر 2020

مجال البحث فيزياء الهندسة المعلوماتية

والبحث باللغة English

تأليف Jianhao He - Feidiao Yang - Jialin Zhang

فيزياء الكم التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem, which is also known as bandit convex optimization with multi-point feedback. In this setting, given access to zeroth-order oracles (that is, the loss function is accessed as a black box that returns the function value for any queried input), a player attempts to minimize a sequence of adversarially generated convex loss functions. This procedure can be described as a $T$ round iterative game between the player and the adversary. In this paper, we present quantum algorithms for the problem and show for the first time that potential quantum advantages are possible for problems of online convex optimization. Specifically, our contributions are as follows. (i) When the player is allowed to query zeroth-order oracles $O(1)$ times in each round as feedback, we give a quantum algorithm that achieves $O(sqrt{T})$ regret without additional dependence of the dimension $n$, which outperforms the already known optimal classical algorithm only achieving $O(sqrt{nT})$ regret. Note that the regret of our quantum algorithm has achieved the lower bound of classical first-order methods. (ii) We show that for strongly convex loss functions, the quantum algorithm can achieve $O(log T)$ regret with $O(1)$ queries as well, which means that the quantum algorithm can achieve the same regret bound as the classical algorithms in the full information setting.

قيم البحث

70 - Yan Zhang , Robert J. Ravier , Michael M. Zavlanos 2019

In this paper, we consider the problem of distributed online convex optimization, where a network of local agents aim to jointly optimize a convex function over a period of multiple time steps. The agents do not have any information about the future. Existing algorithms have established dynamic regret bounds that have explicit dependence on the number of time steps. In this work, we show that we can remove this dependence assuming that the local objective functions are strongly convex. More precisely, we propose a gradient tracking algorithm where agents jointly communicate and descend based on corrected gradient steps. We verify our theoretical results through numerical experiments.

التحسين والتحكم التعلم الآلي

Boosting for Online Convex Optimization

123 - Elad Hazan , Karan Singh 2021

We consider the decision-making framework of online convex optimization with a very large number of experts. This setting is ubiquitous in contextual and reinforcement learning problems, where the size of the policy class renders enumeration and sear ch within the policy class infeasible. Instead, we consider generalizing the methodology of online boosting. We define a weak learning algorithm as a mechanism that guarantees multiplicatively approximate regret against a base class of experts. In this access model, we give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class. We consider both full and partial (a.k.a. bandit) information feedback models. We also give an analogous efficient boosting algorithm for the i.i.d. statistical setting. Our results simultaneously generalize online boosting and gradient boosting guarantees to contextual learning model, online convex optimization and bandit linear optimization settings.

التعلم الآلي التعلم الالي

Adaptive Bound Optimization for Online Convex Optimization

632 - H. Brendan McMahan , Matthew Streeter 2010

We introduce a new online convex optimization algorithm that adaptively chooses its regularization function based on the loss functions observed so far. This is in contrast to previous algorithms that use a fixed regularization function such as L2-sq uared, and modify it only via a single time-dependent parameter. Our algorithms regret bounds are worst-case optimal, and for certain realistic classes of loss functions they are much better than existing bounds. These bounds are problem-dependent, which means they can exploit the structure of the actual problem instance. Critically, however, our algorithm does not need to know this structure in advance. Rather, we prove competitive guarantees that show the algorithm provides a bound within a constant factor of the best possible bound (of a certain functional form) in hindsight.

التعلم الآلي

Robust Control Optimization for Quantum Approximate Optimization Algorithm

120 - Yulong Dong , Xiang Meng , Lin Lin 2019

Quantum variational algorithms have garnered significant interest recently, due to their feasibility of being implemented and tested on noisy intermediate scale quantum (NISQ) devices. We examine the robustness of the quantum approximate optimization algorithm (QAOA), which can be used to solve certain quantum control problems, state preparation problems, and combinatorial optimization problems. We demonstrate that the error of QAOA simulation can be significantly reduced by robust control optimization techniques, specifically, by sequential convex programming (SCP), to ensure error suppression in situations where the source of the error is known but not necessarily its magnitude. We show that robust optimization improves both the objective landscape of QAOA as well as overall circuit fidelity in the presence of coherent errors and errors in initial state preparation.

فيزياء الكم أنظمة وتحكم أنظمة وتحكم

Hierarchical Online Convex Optimization

110 - Juncheng Wang , Ben Liang , Min Dong 2021

We consider online convex optimization (OCO) over a heterogeneous network with communication delay, where multiple workers together with a master execute a sequence of decisions to minimize the accumulation of time-varying global costs. The local dat a may not be independent or identically distributed, and the global cost functions may not be locally separable. Due to communication delay, neither the master nor the workers have in-time information about the current global cost function. We propose a new algorithm, termed Hierarchical OCO (HiOCO), which takes full advantage of the network heterogeneity in information timeliness and computation capacity to enable multi-step gradient descent at both the workers and the master. We analyze the impacts of the unique hierarchical architecture, multi-slot delay, and gradient estimation error to derive upper bounds on the dynamic regret of HiOCO, which measures the gap of costs between HiOCO and an offline globally optimal performance benchmark.

نظرية المعلومات نظرية المعلومات