
Towards All-around Knowledge Transferring: Learning From Task-irrelevant Labels

Posted by Yinghui Li
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Deep neural models have hitherto achieved significant performance on numerous classification tasks, but meanwhile require sufficient manually annotated data. Since it is extremely time-consuming and expensive to annotate adequate data for each classification task, learning an empirically effective model that generalizes from a small dataset has received increased attention. Existing efforts mainly focus on transferring task-relevant knowledge from other similar data to tackle the issue. These approaches have yielded remarkable improvements, yet they neglect the fact that task-irrelevant features can cause massive negative transfer effects. To date, no large-scale studies have been performed to investigate the impact of task-irrelevant features, let alone the utilization of this kind of feature. In this paper, we first propose Task-Irrelevant Transfer Learning (TIRTL) to exploit task-irrelevant features, which are mainly extracted from task-irrelevant labels. In particular, we suppress the expression of task-irrelevant information and facilitate the learning process of classification. We also provide a theoretical explanation of our method. In addition, TIRTL does not conflict with methods that have previously exploited task-relevant knowledge and can be combined with them, enabling the simultaneous utilization of task-relevant and task-irrelevant features for the first time. To verify the effectiveness of our theory and method, we conduct extensive experiments on facial expression recognition and digit recognition tasks. Our source code will also be made available in the future for reproducibility.
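
The abstract describes suppressing the expression of task-irrelevant information alongside the main classification objective, but gives no implementation details. As a rough, hedged sketch of one common way to realize that idea, the PyTorch snippet below attaches an auxiliary head for task-irrelevant labels behind a gradient-reversal layer (as in domain-adversarial training), so the shared features are pushed to become uninformative about those labels. All names, shapes, and the loss weighting are illustrative assumptions, not TIRTL's actual design.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class TaskIrrelevantSuppressor(nn.Module):
    """Shared backbone with a main classifier and an adversarial head that
    tries to predict task-irrelevant labels from the same features."""
    def __init__(self, feat_dim, num_classes, num_irrelevant, lam=0.1):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, feat_dim), nn.ReLU())
        self.task_head = nn.Linear(feat_dim, num_classes)
        self.irrelevant_head = nn.Linear(feat_dim, num_irrelevant)
        self.lam = lam

    def forward(self, x):
        feats = self.backbone(x)
        logits = self.task_head(feats)
        # Gradient reversal: the auxiliary head learns to predict the
        # task-irrelevant label, while the backbone learns to hide it.
        irrelevant_logits = self.irrelevant_head(GradientReversal.apply(feats, self.lam))
        return logits, irrelevant_logits

# Toy usage: cross-entropy on both heads; the reversal turns the second
# term into a suppression signal for the shared features.
model = TaskIrrelevantSuppressor(feat_dim=128, num_classes=10, num_irrelevant=5)
x = torch.randn(4, 1, 28, 28)
y_task = torch.randint(0, 10, (4,))
y_irr = torch.randint(0, 5, (4,))
logits, irr_logits = model(x)
loss = nn.functional.cross_entropy(logits, y_task) + nn.functional.cross_entropy(irr_logits, y_irr)
loss.backward()
```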




Read also

Learning an empirically effective model that generalizes from limited data is a challenging task for deep neural networks. In this paper, we propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features extracted from task-irrelevant labels when training models on small-scale datasets. In particular, we purify feature representations by using the expression of task-irrelevant information, thus facilitating the learning process of classification. Our work is built on solid theoretical analysis and extensive experiments, which demonstrate the effectiveness of PurifiedLearning. According to the theory we prove, PurifiedLearning is model-agnostic and places no restrictions on the model used, so it can easily be combined with any existing deep neural network to achieve better performance. The source code of this paper will be made available in the future for reproducibility.
In complex transfer learning scenarios, new tasks might not be tightly linked to previous tasks. Approaches that transfer information contained only in the final parameters of a source model will therefore struggle. Instead, transfer learning at a higher level of abstraction is needed. We propose Leap, a framework that achieves this by transferring knowledge across learning processes. We associate each task with a manifold on which the training process travels from initialization to final parameters, and construct a meta-learning objective that minimizes the expected length of this path. Our framework leverages only information obtained during training and can be computed on the fly at negligible cost. We demonstrate that our framework outperforms competing methods, both in meta-learning and transfer learning, on a set of computer vision tasks. Finally, we demonstrate that Leap can transfer knowledge across learning processes in demanding reinforcement learning environments (Atari) that involve millions of gradient steps.
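
The Leap abstract frames transfer as shortening the path the training process travels in parameter space. The sketch below is a heavily simplified stand-in under that framing: it measures the Euclidean path length of a few SGD steps per task and pulls a shared initialization toward each task's final parameters (a Reptile-style update). This is not Leap's actual meta-gradient; it only shows the quantity its objective targets. The model, task data, and learning rates are hypothetical.

```python
import copy
import torch
import torch.nn as nn

def train_and_measure_path(model, data, steps=5, lr=0.1):
    """Run a few SGD steps on one task and return the trained model together
    with the Euclidean length of the path traveled in weight space."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    path_length = 0.0
    prev = [p.detach().clone() for p in model.parameters()]
    for x, y in data[:steps]:
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
        cur = [p.detach().clone() for p in model.parameters()]
        path_length += sum(torch.norm(c - q).item() for c, q in zip(cur, prev))
        prev = cur
    return model, path_length

# Meta-loop: pull the shared initialization toward each task's final
# parameters; the recorded path length is what Leap's objective shortens.
init = nn.Linear(10, 3)
tasks = [[(torch.randn(8, 10), torch.randint(0, 3, (8,))) for _ in range(5)] for _ in range(4)]
meta_lr = 0.5
for task_data in tasks:
    task_model, length = train_and_measure_path(copy.deepcopy(init), task_data)
    with torch.no_grad():
        for p_init, p_task in zip(init.parameters(), task_model.parameters()):
            p_init += meta_lr * (p_task - p_init)  # move the shared init along the path
    print(f"task path length: {length:.3f}")
```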
Interactive learning is a process in which a machine learning algorithm is provided with meaningful, well-chosen examples, as opposed to the randomly chosen examples typical in standard supervised learning. In this paper, we propose a new method for interactive learning from multiple noisy labels, where we exploit the disagreement among annotators to quantify the easiness (or meaningfulness) of an example. We demonstrate the usefulness of this method in estimating the parameters of a latent variable classification model, and conduct experimental analyses on a range of synthetic and benchmark datasets. Furthermore, we theoretically analyze the performance of the perceptron in this interactive learning framework.
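
One simple way to turn annotator disagreement into an easiness score, along the lines the abstract suggests, is to measure how concentrated the empirical label distribution of an example is. The sketch below uses one minus the normalized entropy of the annotators' votes and ranks examples by it; the scoring function and the toy annotations are assumptions for illustration, not the paper's exact formulation.

```python
from collections import Counter
import math

def easiness(annotations):
    """Score an example by annotator agreement: 1.0 means all annotators agree,
    lower values indicate more disagreement (a harder or noisier example)."""
    counts = Counter(annotations)
    total = len(annotations)
    # Normalized entropy of the empirical label distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    max_entropy = math.log(len(counts)) if len(counts) > 1 else 1.0
    return 1.0 - entropy / max_entropy

# Hypothetical annotations from three annotators per example.
examples = {
    "ex1": ["cat", "cat", "cat"],    # unanimous -> easy
    "ex2": ["cat", "dog", "cat"],    # mild disagreement
    "ex3": ["cat", "dog", "bird"],   # full disagreement -> hard
}
ordered = sorted(examples, key=lambda k: easiness(examples[k]), reverse=True)
print([(k, round(easiness(examples[k]), 2)) for k in ordered])
```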
Jiaqian Ren, Hao Peng, Lei Jiang (2021)
Recently published graph neural networks (GNNs) show promising performance at social event detection tasks. However, most studies are oriented toward monolingual data in languages with abundant training samples. This has left the more common multilingual settings and lesser-spoken languages relatively unexplored. Thus, we present a GNN that incorporates cross-lingual word embeddings for detecting events in multilingual data streams. The first exploit is to make the GNN work with multilingual data. For this, we outline a construction strategy that aligns messages in different languages at both the node and semantic levels. Relationships between messages are established by merging entities that are the same but are referred to in different languages. Non-English message representations are converted into English semantic space via the cross-lingual word embeddings. The resulting message graph is then uniformly encoded by a GNN model. In special cases where a lesser-spoken language needs to be detected, a novel cross-lingual knowledge distillation framework, called CLKD, exploits prior knowledge learned from similar threads in English to make up for the paucity of annotated data. Experiments on both synthetic and real-world datasets show the framework to be highly effective at detection in both multilingual data and in languages where training samples are scarce.
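
For the knowledge-distillation part of CLKD, a minimal sketch of a standard distillation loss is shown below: a student trained on a lesser-spoken language matches the softened predictions of a teacher trained on English threads while also fitting whatever hard labels exist. The temperature, blending weight, and tensor shapes are illustrative assumptions; the paper's actual CLKD objective may differ.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against the (frozen) English teacher with a hard
    cross-entropy term on labels available in the lesser-spoken language."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Hypothetical shapes: 8 messages, 4 candidate event classes.
student_logits = torch.randn(8, 4, requires_grad=True)
teacher_logits = torch.randn(8, 4)  # produced by the English-trained teacher
labels = torch.randint(0, 4, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```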
Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationships based on the prediction performance of the learned models. Since many natural phenomena exhibit sparse structures, enforcing sparsity on learned models reveals the underlying task relationships. Moreover, different sparsification degrees from a fully connected graph uncover various types of structures, like cliques, trees, lines, clusters, or fully disconnected graphs. In this paper, we propose a bilevel formulation of multi-task learning that induces sparse graphs, thus revealing the underlying task relationships, and an efficient method for its computation. We show empirically how the induced sparse graph improves the interpretability of the learned models and their relationships on synthetic and real data, without sacrificing generalization performance. Code at https://bit.ly/GraphGuidedMTL
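
The paper proposes a bilevel formulation, which is not reproduced here; the single-level sketch below only illustrates the role a sparse task-relationship graph can play: an adjacency matrix couples the parameters of related tasks, and an L1 penalty prunes weak edges so the learned graph stays sparse. All symbols (the adjacency `A`, the regularization weights, the toy linear tasks) are hypothetical.

```python
import torch

def sparse_graph_mtl_loss(task_losses, task_weights, adjacency, graph_reg=0.1, sparsity=0.01):
    """Single-level stand-in for the idea: tasks whose models should be similar
    get connected (large A_ij), and an L1 penalty prunes weak edges so the
    learned task graph stays sparse and interpretable."""
    fit = sum(task_losses)
    coupling = 0.0
    num_tasks = len(task_weights)
    # Encourage connected tasks to have similar parameters.
    for i in range(num_tasks):
        for j in range(i + 1, num_tasks):
            coupling = coupling + adjacency[i, j] * torch.sum((task_weights[i] - task_weights[j]) ** 2)
    return fit + graph_reg * coupling + sparsity * adjacency.abs().sum()

# Hypothetical setup: 3 linear regression tasks sharing a 5-dimensional input space.
W = [torch.randn(5, requires_grad=True) for _ in range(3)]
A = torch.rand(3, 3, requires_grad=True)  # learned task-relationship graph
X, y = torch.randn(20, 5), torch.randn(3, 20)
losses = [torch.mean((X @ W[t] - y[t]) ** 2) for t in range(3)]
loss = sparse_graph_mtl_loss(losses, W, A)
loss.backward()
```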
