ﻻ يوجد ملخص باللغة العربية
Online machine learning systems need to adapt to domain shifts. Meanwhile, acquiring label at every timestep is expensive. We propose a surprisingly simple algorithm that adaptively balances its regret and its number of label queries in settings where the data streams are from a mixture of hidden domains. For online linear regression with oblivious adversaries, we provide a tight tradeoff that depends on the durations and dimensionalities of the hidden domains. Our algorithm can adaptively deal with interleaving spans of inputs from different domains. We also generalize our results to non-linear regression for hypothesis classes with bounded eluder dimension and adaptive adversaries. Experiments on synthetic and realistic datasets demonstrate that our algorithm achieves lower regret than uniform queries and greedy queries with equal labeling budget.
Active learning is an important technique to reduce the number of labeled examples in supervised learning. Active learning for binary classification has been well addressed in machine learning. However, active learning of the reject option classifier
Transfer learning has been demonstrated to be successful and essential in diverse applications, which transfers knowledge from related but different source domains to the target domain. Online transfer learning(OTL) is a more challenging problem wher
In this paper, we investigate the impact of diverse user preference on learning under the stochastic multi-armed bandit (MAB) framework. We aim to show that when the user preferences are sufficiently diverse and each arm can be optimal for certain us
Deep learning models are considered to be state-of-the-art in many offline machine learning tasks. However, many of the techniques developed are not suitable for online learning tasks. The problem of using deep learning models with sequential data be
This paper deals with bandit online learning problems involving feedback of unknown delay that can emerge in multi-armed bandit (MAB) and bandit convex optimization (BCO) settings. MAB and BCO require only values of the objective function involved th