Unsupervised Continual Learning via Self-Adaptive Deep Clustering Approach

82 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mahardhika Pratama Dr

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Mahardhika Pratama - Andri Ashfahani - Edwin Lughofer

التعلم الآلي الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Unsupervised continual learning remains a relatively uncharted territory in the existing literature because the vast majority of existing works call for unlimited access of ground truth incurring expensive labelling cost. Another issue lies in the problem of task boundaries and task IDs which must be known for models updates or models predictions hindering feasibility for real-time deployment. Knowledge Retention in Self-Adaptive Deep Continual Learner, (KIERA), is proposed in this paper. KIERA is developed from the notion of flexible deep clustering approach possessing an elastic network structure to cope with changing environments in the timely manner. The centroid-based experience replay is put forward to overcome the catastrophic forgetting problem. KIERA does not exploit any labelled samples for model updates while featuring a task-agnostic merit. The advantage of KIERA has been numerically validated in popular continual learning problems where it shows highly competitive performance compared to state-of-the art approaches. Our implementation is available in textit{url{https://github.com/ContinualAL/KIERA}}.

قيم البحث

90 - Dushyant Rao , Francesco Visin , Andrei A. Rusu 2019

Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially. Prior art in the field has largely considered supervised or reinforc ement learning tasks, and often assumes full knowledge of task labels and boundaries. In this work, we propose an approach (CURL) to tackle a more general problem that we will refer to as unsupervised continual learning. The focus is on learning representations without any knowledge about task identity, and we explore scenarios when there are abrupt changes between tasks, smooth transitions from one task to another, or even when the data is shuffled. The proposed approach performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised learning setting with MNIST and Omniglot, where the lack of labels ensures no information is leaked about the task. Further, we demonstrate strong performance compared to prior art in an i.i.d setting, or when adapting the technique to supervised tasks such as incremental class learning.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

246 - Tianpei Yang , Jianye Hao , Zhaopeng Meng 2020

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks. Existing transfer approaches either explicitly computes the similarity between task s or select appropriate source policies to provide guided explorations for the target task. However, how to directly optimize the target policy by alternatively utilizing knowledge from appropriate source policies without explicitly measuring the similarity is currently missing. In this paper, we propose a novel Policy Transfer Framework (PTF) to accelerate RL by taking advantage of this idea. Our framework learns when and which source policy is the best to reuse for the target policy and when to terminate it by modeling multi-policy transfer as the option learning problem. PTF can be easily combined with existing deep RL approaches. Experimental results show it significantly accelerates the learning process and surpasses state-of-the-art policy transfer methods in terms of learning efficiency and final performance in both discrete and continuous action spaces.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Unsupervised Continual Learning Via Pseudo Labels

94 - Jiangpeng He , Fengqing Zhu 2021

Continual learning aims to learn new tasks incrementally using less computation and memory resources instead of retraining the model from scratch whenever new task arrives. However, existing approaches are designed in supervised fashion assuming all data from new tasks have been manually annotated, which are not practical for many real-life applications. In this work, we propose to use pseudo label instead of the ground truth to make continual learning feasible in unsupervised mode. The pseudo labels of new data are obtained by applying global clustering algorithm and we propose to use the model updated from last incremental step as the feature extractor. Due to the scarcity of existing work, we introduce a new benchmark experimental protocol for unsupervised continual learning of image classification task under class-incremental setting where no class label is provided for each incremental learning step. Our method is evaluated on the CIFAR-100 and ImageNet (ILSVRC) datasets by incorporating the pseudo label with various existing supervised approaches and show promising results in unsupervised scenario.

الرؤية الحاسوبية وتمييز الأنماط

Continual Learning of Control Primitives: Skill Discovery via Reset-Games

300 - Kelvin Xu , Siddharth Verma , Chelsea Finn 2020

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed. First, in real world settings, when an agent at tempts a task and fails, the environment must somehow reset so that the agent can attempt the task again. While easy in simulation, this could require considerable human effort in the real world, especially if the number of trials is very large. Second, real world learning often involves complex, temporally extended behavior that is often difficult to acquire with random exploration. While these two problems may at first appear unrelated, in this work, we show how a single method can allow an agent to acquire skills with minimal supervision while removing the need for resets. We do this by exploiting the insight that the need to reset an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of reset-skills. We propose a general-sum game formulation that balances the objectives of resetting and learning skills, and demonstrate that this approach improves performance on reset-free tasks, and additionally show that the skills we obtain can be used to significantly accelerate downstream learning.

التعلم الآلي الذكاء الاصطناعي

A Critique of Self-Expressive Deep Subspace Clustering

133 - Benjamin D. Haeffele , Chong You , Rene Vidal 2020

Subspace clustering is an unsupervised clustering technique designed to cluster data that is supported on a union of linear subspaces, with each subspace defining a cluster with dimension lower than the ambient space. Many existing formulations for t his problem are based on exploiting the self-expressive property of linear subspaces, where any point within a subspace can be represented as linear combination of other points within the subspace. To extend this approach to data supported on a union of non-linear manifolds, numerous studies have proposed learning an embedding of the original data using a neural network which is regularized by a self-expressive loss function on the data in the embedded space to encourage a union of linear subspaces prior on the data in the embedded space. Here we show that there are a number of potential flaws with this approach which have not been adequately addressed in prior work. In particular, we show the model formulation is often ill-posed in that it can lead to a degenerate embedding of the data, which need not correspond to a union of subspaces at all and is poorly suited for clustering. We validate our theoretical results experimentally and also repeat prior experiments reported in the literature, where we conclude that a significant portion of the previously claimed performance benefits can be attributed to an ad-hoc post processing step rather than the deep subspace clustering model.

التعلم الآلي الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط