Models trained with offline data often suffer from continual distribution shifts and expensive labeling in changing environments. This calls for a new online learning paradigm in which the learner can continually adapt to changing environments with limited labels. In this paper, we propose a new online setting -- Online Active Continual Adaptation, where the learner aims to continually adapt to changing distributions using both unlabeled samples and active queries of limited labels. To this end, we propose Online Self-Adaptive Mirror Descent (OSAMD), which adopts an online teacher-student structure to enable online self-training from unlabeled data, and a margin-based criterion that decides whether to query the labels to track changing distributions. Theoretically, we show that, in the separable case, OSAMD has an $O(T^{1/2})$ dynamic regret bound under mild assumptions, which is even tighter than the lower bound $\Omega(T^{2/3})$ of traditional online learning with full labels. In the general case, we show a regret bound of $O({\alpha^*}^{1/3} T^{2/3} + \alpha^* T)$, where $\alpha^*$ denotes the separability of domains and is usually small. Our theoretical results show that OSAMD can adapt quickly to changing environments with active queries. Empirically, we demonstrate that OSAMD achieves favorable regret under changing environments with limited labels on both simulated and real-world data, which corroborates our theoretical findings.
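To make the margin-based query idea concrete, the following is a minimal sketch and not the authors' OSAMD algorithm. It assumes a linear classifier, a hinge-loss gradient step (mirror descent with a Euclidean regularizer), and a single model that produces its own pseudo-labels in place of the teacher-student structure. The names `run_margin_query_learner` and `drifting_stream` and the parameters `margin` and `lr` are hypothetical, chosen for illustration only.

```python
import math
import random


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def run_margin_query_learner(stream, dim, margin=0.5, lr=0.1):
    """Online linear classifier that queries a label only when unsure.

    `stream` yields (x, y) pairs with y in {-1, +1}. The true label y is
    used for the update only when a query is issued; otherwise the model's
    own prediction serves as a pseudo-label (simplified self-training).
    Returns (weights, number of queries, number of mistakes).
    """
    w = [0.0] * dim
    queries = 0
    mistakes = 0
    for x, y in stream:
        score = dot(w, x)
        pred = 1 if score >= 0 else -1
        mistakes += int(pred != y)  # bookkeeping only, not used for learning

        if abs(score) < margin:
            # Low-confidence prediction: spend one label query.
            target = y
            queries += 1
        else:
            # Confident prediction: train on the pseudo-label instead.
            target = pred

        # Hinge-loss subgradient step, taken only when the hinge is active.
        if target * score < 1:
            w = [wi + lr * target * xi for wi, xi in zip(w, x)]
    return w, queries, mistakes


def drifting_stream(T, seed=0):
    """Hypothetical 2-D stream whose true separator rotates slowly over time."""
    rng = random.Random(seed)
    for t in range(T):
        angle = 0.5 * math.pi * t / T
        u = (math.cos(angle), math.sin(angle))
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = 1 if dot(u, x) >= 0 else -1
        yield x, y


w, queries, mistakes = run_margin_query_learner(drifting_stream(2000), dim=2)
print(f"queried {queries} of 2000 labels, made {mistakes} mistakes")
```

In this sketch a larger `margin` triggers more queries and so tracks drift more closely at a higher labeling cost, while a smaller `margin` relies more on pseudo-labels.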
Real-world tasks often exhibit a compositional structure that contains a sequence of simpler sub-tasks. For instance, opening a door requires reaching, grasping, rotating, and pulling the door knob. Such compositional tasks require an agent to reason
Data collection and annotation are time-consuming in machine learning, especially for large-scale problems. A common approach to this problem is to transfer knowledge from a related labeled domain to a target one. There are two popular ways to achiev
Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant representations to bridge the domain gap. More recently, self-training has been gaining momentum in UDA. Originating from semi-supervised learning, self-training uses un
Intelligent agents must pursue their goals in complex environments with partial information and often limited computational capacity. Reinforcement learning methods have achieved great success by creating agents that optimize engineered reward functi
Training a neural network model can be a lifelong learning process and is a computationally intensive one. A severe adverse effect that may occur in deep neural network models is that they can suffer from catastrophic forgetting during retraining on