ﻻ يوجد ملخص باللغة العربية
We study the problem of controlling a linear dynamical system with adversarial perturbations where the only feedback available to the controller is the scalar loss, and the loss function itself is unknown. For this problem, with either a known or unknown system, we give an efficient sublinear regret algorithm. The main algorithmic difficulty is the dependence of the loss on past controls. To overcome this issue, we propose an efficient algorithm for the general setting of bandit convex optimization for loss functions with memory, which may be of independent interest.
We study the problem of stochastic combinatorial pure exploration (CPE), where an agent sequentially pulls a set of single arms (a.k.a. a super arm) and tries to find the best super arm. Among a variety of problem settings of the CPE, we focus on the
Motivated by the high-frequency data streams continuously generated, real-time learning is becoming increasingly important. These data streams should be processed sequentially with the property that the stream may change over time. In this streaming
We design an algorithm which finds an $epsilon$-approximate stationary point (with $| abla F(x)|le epsilon$) using $O(epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stro
Several learning problems involve solving min-max problems, e.g., empirical distributional robust learning or learning with non-standard aggregated losses. More specifically, these problems are convex-linear problems where the minimization is carried
We formulate a new problem at the intersectionof semi-supervised learning and contextual bandits,motivated by several applications including clini-cal trials and ad recommendations. We demonstratehow Graph Convolutional Network (GCN), a semi-supervis