
Local Contrast Learning

Posted by: Chuanyun Xu Dr.
Publication date: 2018
Research field: Informatics Engineering
Paper language: English





Learning a deep model from small data is still an open and challenging problem. We focus on one-shot classification with a deep learning approach that relies on only a small quantity of training samples. We propose a novel deep learning approach named Local Contrast Learning (LCL), based on the key insight from human cognitive behavior that a person recognizes an object in a specific context by contrasting it against other objects in that context or in memory. LCL trains a deep model to contrast the sample being recognized against a couple of contrastive samples that are randomly drawn and shuffled. On the one-shot classification task on Omniglot, an LCL-based deep model with 122 layers and 1.94 million parameters, trained on a tiny dataset of only 60 classes with 20 samples per class, achieved 97.99% accuracy, outperforming both humans and the state of the art established by Bayesian Program Learning (BPL) trained on 964 classes. LCL is a fundamental idea that can be applied to alleviate the overfitting of parametric models caused by a lack of training samples.
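The abstract does not spell out the training procedure, so the following is a minimal, hypothetical sketch of one LCL-style episode in PyTorch: a query is scored against a few shuffled contrastive samples, exactly one of which shares its class, and the model is trained to pick out that sample's position. The names ContrastScorer, make_episode, and data_by_class are illustrative, and the tiny encoder merely stands in for the paper's 122-layer model.

```python
# Minimal sketch of an LCL-style training step (assumption: a small siamese-style
# scorer stands in for the paper's 122-layer model; ContrastScorer, make_episode
# and data_by_class are illustrative names, not the authors' code).
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastScorer(nn.Module):
    """Embeds images and scores how well each contrastive sample matches the query."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, embed_dim),
        )

    def forward(self, query, contrast):        # query: (1, 1, H, W), contrast: (K, 1, H, W)
        q = self.encoder(query)                # (1, D)
        c = self.encoder(contrast)             # (K, D)
        return -(q - c).pow(2).sum(dim=1)      # higher score = closer match, shape (K,)

def make_episode(data_by_class, k=5):
    """Draw a query plus k shuffled contrastive samples, exactly one sharing its class."""
    classes = random.sample(list(data_by_class), k)
    query, positive = random.sample(data_by_class[classes[0]], 2)
    contrast = [positive] + [random.choice(data_by_class[c]) for c in classes[1:]]
    order = random.sample(range(k), k)         # random shuffle of the contrast positions
    target = order.index(0)                    # where the matching sample ended up
    return query.unsqueeze(0), torch.stack([contrast[i] for i in order]), torch.tensor(target)

# Toy stand-in for Omniglot: 10 classes, 20 random 28x28 "images" each.
data_by_class = {c: [torch.randn(1, 28, 28) for _ in range(20)] for c in range(10)}

model = ContrastScorer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
query, contrast, target = make_episode(data_by_class)
loss = F.cross_entropy(model(query, contrast).unsqueeze(0), target.unsqueeze(0))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because the supervision here is only the shuffled position of the matching sample, nothing in the scorer is tied to particular class labels, which is plausibly what lets the same contrast procedure be reused at test time on classes never seen during training.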




Read also

Qianjiang Hu, Xiao Wang, Wei Hu, 2020
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. Existing contrastive learning methods either maintain a queue of negative samples over minibatches while only a small portion of them are updated in an iteration, or only use the other examples from the current minibatch as negatives. They could not closely track the change of the learned representation over iterations by updating the entire queue as a whole, or discard the useful information from the past minibatches. Alternatively, we propose to directly learn a set of negative adversaries that play against the self-trained representation. Two players, the representation network and the negative adversaries, are alternately updated to obtain the most challenging negative examples against which the representation of positive queries will be trained to discriminate. We further show that the negative adversaries are updated towards a weighted combination of positive queries by maximizing the adversarial contrastive loss, thereby allowing them to closely track the change of representations over time. Experimental results demonstrate that the proposed Adversarial Contrastive (AdCo) model not only achieves superior performance (a top-1 accuracy of 73.2% over 200 epochs and 75.7% over 800 epochs with linear evaluation on ImageNet), but also can be pre-trained more efficiently with fewer epochs.
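As a rough illustration of the adversarial negative update described above, the sketch below treats the negatives as free parameters and takes one gradient-ascent step on an InfoNCE-style contrastive loss; the tensor sizes, temperature, and step size are arbitrary placeholders, and this is not the AdCo reference implementation.

```python
# Hedged sketch of an adversarial negative update (not the AdCo reference code):
# negatives are free parameters, updated by gradient ascent on an InfoNCE-style loss.
import torch
import torch.nn.functional as F

temperature, neg_lr = 0.1, 3.0                               # arbitrary placeholder values
queries = F.normalize(torch.randn(256, 128), dim=1)          # embeddings of positive queries
keys = F.normalize(torch.randn(256, 128), dim=1)             # their positive keys
negatives = F.normalize(torch.randn(4096, 128), dim=1).requires_grad_(True)

pos_logits = (queries * keys).sum(dim=1, keepdim=True) / temperature   # (256, 1)
neg_logits = queries @ negatives.t() / temperature                     # (256, 4096)
logits = torch.cat([pos_logits, neg_logits], dim=1)
loss = F.cross_entropy(logits, torch.zeros(len(queries), dtype=torch.long))

# The encoder would be trained to minimize this loss; the negatives are pushed the
# other way (gradient ascent), which moves each one toward a softmax-weighted
# combination of the queries it most confuses.
loss.backward()
with torch.no_grad():
    negatives.add_(neg_lr * negatives.grad)
    negatives.copy_(F.normalize(negatives, dim=1))           # keep negatives on the unit sphere
```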
We propose a new clustering algorithm, Extended Affinity Propagation, based on pairwise similarities. Extended Affinity Propagation is developed by modifying Affinity Propagation such that the desirable features of Affinity Propagation, e.g., exemplars, reasonable computational complexity and no need to specify number of clusters, are preserved while the shortcomings, e.g., the lack of global structure discovery, that limit the applicability of Affinity Propagation are overcome. Extended Affinity Propagation succeeds not only in achieving this goal but can also provide various additional insights into the internal structure of the individual clusters, e.g., refined confidence values, relative cluster densities and local cluster strength in different regions of a cluster, which are valuable for an analyst. We briefly discuss how these insights can help in easily tuning the hyperparameters. We also illustrate these desirable features and the performance of Extended Affinity Propagation on various synthetic and real world datasets.
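The abstract does not describe the modification in enough detail to reproduce, but for reference, standard Affinity Propagation over a precomputed pairwise-similarity matrix, which is the setting being extended, can be run with scikit-learn as follows; the toy data and default preference are arbitrary.

```python
# Baseline only: standard Affinity Propagation on pairwise similarities via scikit-learn.
# This is the algorithm the paper extends, not Extended Affinity Propagation itself.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(30, 2)) for loc in (-2.0, 0.0, 2.0)])

# Pairwise similarities: negative squared Euclidean distance, as in the original algorithm.
S = -np.square(X[:, None, :] - X[None, :, :]).sum(-1)

ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
print("exemplar indices:", ap.cluster_centers_indices_)      # exemplars chosen by the algorithm
print("number of clusters found:", len(set(ap.labels_)))     # no k specified up front
```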
Humans, as the most powerful learners on the planet, have accumulated a lot of learning skills, such as learning through tests, interleaving learning, self-explanation, and active recalling, to name a few. These learning skills and methodologies enable humans to learn new topics more effectively and efficiently. We are interested in investigating whether humans' learning skills can be borrowed to help machines learn better. Specifically, we aim to formalize these skills and leverage them to train better machine learning (ML) models. To achieve this goal, we develop a general framework -- Skillearn, which provides a principled way to represent humans' learning skills mathematically and use the formally represented skills to improve the training of ML models. In two case studies, we apply Skillearn to formalize two learning skills of humans: learning by passing tests and interleaving learning, and use the formalized skills to improve neural architecture search. Experiments on various datasets show that, when trained using the skills formalized by Skillearn, ML models achieve significantly better performance.
This paper presents locally decoupled network parameter learning with local propagation. Three elements are taken into account: (i) sets of nonlinear transforms that describe the representations at all nodes, (ii) a local objective at each node related to the corresponding local representation goal, and (iii) a local propagation model that relates the nonlinear error vectors at each node with the goal error vectors from the directly connected nodes. The modeling concepts (i), (ii) and (iii) offer several advantages, including (a) a unified learning principle for any network that is represented as a graph, (b) understanding and interpretation of the local and the global learning dynamics, (c) decoupled and parallel parameter learning, and (d) a possibility for learning in infinitely long, multi-path and multi-goal networks. Numerical experiments validate the potential of the learning principle. The preliminary results show advantages in comparison to the state-of-the-art methods, w.r.t. the learning time and the network size, while having comparable recognition accuracy.
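The paper's propagation model relating error vectors across nodes is specific to that work and is not reproduced here; the sketch below only illustrates the generic idea of decoupled, node-local parameter updates, with each layer trained against its own local objective on detached inputs. All layer and head names, sizes, and the dummy data are illustrative.

```python
# Generic sketch of decoupled, layer-local parameter learning (illustrates only the
# "decoupled and parallel" aspect; the paper's error-vector propagation model is not
# reproduced here).
import torch
import torch.nn as nn
import torch.nn.functional as F

layers = nn.ModuleList([nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
                        for d_in, d_out in [(784, 256), (256, 128), (128, 64)]])
heads = nn.ModuleList([nn.Linear(d, 10) for d in (256, 128, 64)])     # one local objective per node
optimizers = [torch.optim.SGD(list(l.parameters()) + list(h.parameters()), lr=0.1)
              for l, h in zip(layers, heads)]

x = torch.randn(32, 784)                      # a dummy minibatch
y = torch.randint(0, 10, (32,))               # dummy labels shared by the local goals

h = x
for layer, head, opt in zip(layers, heads, optimizers):
    h = layer(h.detach())                     # detach: no gradient flows back to earlier nodes
    local_loss = F.cross_entropy(head(h), y)  # each node trains against its own objective
    opt.zero_grad()
    local_loss.backward()
    opt.step()
```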
Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially. Prior art in the field has largely considered supervised or reinforcement learning tasks, and often assumes full knowledge of task labels and boundaries. In this work, we propose an approach (CURL) to tackle a more general problem that we will refer to as unsupervised continual learning. The focus is on learning representations without any knowledge about task identity, and we explore scenarios when there are abrupt changes between tasks, smooth transitions from one task to another, or even when the data is shuffled. The proposed approach performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised learning setting with MNIST and Omniglot, where the lack of labels ensures no information is leaked about the task. Further, we demonstrate strong performance compared to prior art in an i.i.d. setting, or when adapting the technique to supervised tasks such as incremental class learning.

