ترغب بنشر مسار تعليمي؟ اضغط هنا

Multitask Learning with Single Gradient Step Update for Task Balancing

70   0   0.0 ( 0 )
 نشر من قبل Youngdoo Son
 تاريخ النشر 2020
والبحث باللغة English




اسأل ChatGPT حول البحث

Multitask learning is a methodology to boost generalization performance and also reduce computational intensity and memory usage. However, learning multiple tasks simultaneously can be more difficult than learning a single task because it can cause imbalance among tasks. To address the imbalance problem, we propose an algorithm to balance between tasks at the gradient level by applying gradient-based meta-learning to multitask learning. The proposed method trains shared layers and task-specific layers separately so that the two layers with different roles in a multitask network can be fitted to their own purposes. In particular, the shared layer that contains informative knowledge shared among tasks is trained by employing single gradient step update and inner/outer loop training to mitigate the imbalance problem at the gradient level. We apply the proposed method to various multitask computer vision problems and achieve state-of-the-art performance.



قيم البحث

اقرأ أيضاً

We present our entry into the 2021 3C Shared Task Citation Context Classification based on Purpose competition. The goal of the competition is to classify a citation in a scientific article based on its purpose. This task is important because it coul d potentially lead to more comprehensive ways of summarizing the purpose and uses of scientific articles, but it is also difficult, mainly due to the limited amount of available training data in which the purposes of each citation have been hand-labeled, along with the subjectivity of these labels. Our entry in the competition is a multi-task model that combines multiple modules designed to handle the problem from different perspectives, including hand-generated linguistic features, TF-IDF features, and an LSTM-with-attention model. We also provide an ablation study and feature analysis whose insights could lead to future work.
We introduce a novel online multitask setting. In this setting each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms that are d esigned to exploit the scenario where there are many such segments but significantly fewer associated hypotheses. We prove regret bounds that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting this is equivalent to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003]. We provide an algorithm that predicts on each trial in time linear in the number of hypotheses when the hypothesis class is finite. We also consider infinite hypothesis classes from reproducing kernel Hilbert spaces for which we give an algorithm whose per trial time complexity is cubic in the number of cumulative trials. In the single-task special case this is the first example of an efficient regret-bounded switching algorithm with long-term memory for a non-parametric hypothesis class.
We develop a mathematical framework for solving multi-task reinforcement learning (MTRL) problems based on a type of policy gradient method. The goal in MTRL is to learn a common policy that operates effectively in different environments; these envir onments have similar (or overlapping) state spaces, but have different rewards and dynamics. We highlight two fundamental challenges in MTRL that are not present in its single task counterpart, and illustrate them with simple examples. We then develop a decentralized entropy-regularized policy gradient method for solving the MTRL problem, and study its finite-time convergence rate. We demonstrate the effectiveness of the proposed method using a series of numerical experiments. These experiments range from small-scale GridWorld problems that readily demonstrate the trade-offs involved in multi-task learning to large-scale problems, where common policies are learned to navigate an airborne drone in multiple (simulated) environments.
While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge. Multi-task learning has emerg ed as a promising approach for sharing structure across multiple tasks to enable more efficient learning. However, the multi-task setting presents a number of optimization challenges, making it difficult to realize large efficiency gains compared to learning tasks independently. The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood. In this work, we identify a set of three conditions of the multi-task optimization landscape that cause detrimental gradient interference, and develop a simple yet general approach for avoiding such interference between task gradients. We propose a form of gradient surgery that projects a tasks gradient onto the normal plane of the gradient of any other task that has a conflicting gradient. On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance. Further, it is model-agnostic and can be combined with previously-proposed multi-task architectures for enhanced performance.
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters , where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a distilled policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا