
Dynamically Addressing Unseen Rumor via Continual Learning

Posted by: Nayeon Lee
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Rumors are often associated with newly emerging events, so the ability to deal with unseen rumors is crucial for a rumor-veracity classification model. Previous works address this issue by improving the model's generalizability, under the assumption that the model will stay unchanged even after a new event breaks out. In this work, we propose an alternative solution: continuously updating the model in accordance with the dynamics of rumor-domain creation. The biggest technical challenge of this new approach is catastrophic forgetting, where new learning overwrites what was previously learned. We adopt continual learning strategies that control the new learning so as to avoid catastrophic forgetting, and propose an additional strategy that can be used jointly to further strengthen the alleviation of forgetting.
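The abstract does not spell out which continual learning strategies are used; below is a minimal sketch of one widely used family of forgetting-alleviation strategies, elastic weight consolidation (EWC)-style regularization, assuming a generic PyTorch rumor classifier. The function names, the λ value, and the use of the empirical Fisher are illustrative choices, not the paper's exact method.

```python
# Minimal sketch of an EWC-style regularizer for continual rumor classification.
# One common forgetting-alleviation strategy, shown for illustration only;
# model, loaders, and hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def estimate_fisher(model, loader, n_batches=10):
    """Diagonal (empirical) Fisher information estimated on the old rumor domain."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for i, (x, y) in enumerate(loader):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / n_batches
    return fisher

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    """Quadratic penalty keeping parameters close to their old-domain values.
    old_params should hold detached copies of the parameters after the
    previous domain was learned."""
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return lam * loss

# Training on a newly emerging rumor domain:
#   total_loss = F.cross_entropy(model(x), y) + ewc_penalty(model, fisher, old_params)
```

The penalty anchors the weights that mattered for earlier rumor domains while leaving the remaining capacity free to adapt to the new event.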




Read also

We present a method for continual learning of speech representations for multiple languages using self-supervised learning (SSL) and applying these for automatic speech recognition. There is an abundance of unannotated speech, so creating self-supervised representations from raw audio and fine-tuning on a small annotated dataset is a promising direction for building speech recognition systems. Wav2vec models perform SSL on raw audio in a pretraining phase and then fine-tune on a small fraction of annotated data. SSL models have produced state-of-the-art results for ASR. However, these models are very expensive to pretrain with self-supervision. We tackle the problem of continually learning new language representations from audio without forgetting a previous language representation. We use ideas from continual learning to transfer knowledge from a previous task to speed up pretraining on a new language task. Our continual-wav2vec2 model can decrease pretraining times by 32% when learning a new language task, and learns the new audio-language representation without forgetting the previous language representation.
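As an illustration of the knowledge-transfer idea behind this kind of continual pretraining, here is a minimal PyTorch sketch: warm-start from the previous language's checkpoint and freeze the low-level encoder so only higher layers adapt. The attribute and method names (feature_encoder, ssl_loss) and the checkpoint path are hypothetical placeholders, not the actual continual-wav2vec2 API.

```python
# Sketch: continue self-supervised pretraining on a new language from a
# previous-language checkpoint instead of starting from scratch.
import torch

def continue_pretraining(model, new_language_loader, steps=1000, lr=1e-5):
    """Adapt a pretrained SSL speech model to a new language while preserving
    low-level representations learned on the previous language."""
    # Freeze the convolutional feature encoder (attribute name is illustrative)
    # so earlier speech features are retained rather than relearned.
    for p in model.feature_encoder.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=lr)
    for _, batch in zip(range(steps), new_language_loader):
        loss = model.ssl_loss(batch)  # contrastive SSL objective; method name illustrative
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Usage (hypothetical checkpoint of the previous language):
#   model = torch.load("wav2vec2_lang_A.pt")
#   continue_pretraining(model, language_b_audio_loader)
```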
While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distributions is continual learning, where the non-stationarity is imposed by a sequence of distinct tasks. Most methods in this space assume, however, knowledge of task boundaries, and focus on alleviating catastrophic forgetting. In this work, we depart from this view and move the focus towards faster remembering, i.e., measuring how quickly the network recovers performance rather than measuring the network's performance without any adaptation. We argue that in many settings this can be more effective, and that it opens the door to combining meta-learning and continual learning techniques, leveraging their complementary advantages. We propose a framework specifically for the scenario where no information about task boundaries or task identity is given. It relies on a separation of concerns between what task is being solved and how the task should be solved. This framework is implemented by differentiating task-specific parameters from task-agnostic parameters, where the latter are optimized in a continual meta-learning fashion, without access to multiple tasks at the same time. We showcase this framework in a supervised learning scenario and discuss the implications of the proposed formalism.
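A minimal sketch of the task-specific / task-agnostic separation described above, assuming a simple PyTorch classifier; the module split and the two learning rates are illustrative choices, not the paper's configuration.

```python
# Sketch: separate slow, task-agnostic parameters (body) from fast,
# task-specific parameters (head) so the network can "remember quickly"
# after an unannounced task shift.
import torch
import torch.nn as nn

class SeparatedModel(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=5):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # task-agnostic
        self.head = nn.Linear(hidden, n_classes)                          # task-specific
    def forward(self, x):
        return self.head(self.body(x))

model = SeparatedModel()
# Fast updates on the head recover performance quickly when the task changes;
# slow updates on the body accumulate knowledge transferable across tasks.
head_opt = torch.optim.SGD(model.head.parameters(), lr=1e-2)  # fast adaptation
body_opt = torch.optim.SGD(model.body.parameters(), lr=1e-4)  # slow, continual
```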
Continual learning aims to learn new tasks incrementally, using less computation and memory than retraining the model from scratch whenever a new task arrives. However, existing approaches are designed in a supervised fashion, assuming all data from new tasks has been manually annotated, which is not practical for many real-life applications. In this work, we propose to use pseudo labels instead of ground truth to make continual learning feasible in an unsupervised mode. The pseudo labels of new data are obtained by applying a global clustering algorithm, and we propose to use the model updated from the last incremental step as the feature extractor. Due to the scarcity of existing work, we introduce a new benchmark experimental protocol for unsupervised continual learning of image classification under a class-incremental setting, where no class label is provided at any incremental learning step. Our method is evaluated on the CIFAR-100 and ImageNet (ILSVRC) datasets by incorporating the pseudo labels into various existing supervised approaches, and shows promising results in the unsupervised scenario.
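The pseudo-labeling step translates naturally into code. The sketch below uses scikit-learn's KMeans as the global clustering algorithm (the abstract does not name a specific one) and treats the previous-step model as a frozen feature extractor; all names are illustrative.

```python
# Sketch: derive pseudo labels for unlabeled new-class data by clustering
# features from the model of the previous incremental step.
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def pseudo_label(feature_extractor, images, n_new_classes):
    """Return cluster-index pseudo labels for a batch of unlabeled images."""
    feature_extractor.eval()
    feats = feature_extractor(images).cpu().numpy()  # features from previous-step model
    labels = KMeans(n_clusters=n_new_classes, n_init=10).fit_predict(feats)
    return torch.from_numpy(labels)

# The (image, pseudo_label) pairs can then be fed to any existing supervised
# class-incremental method in place of ground-truth annotations.
```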
Meta continual learning algorithms seek to train a model when faced with similar tasks observed in a sequential manner. Despite promising methodological advancements, there is a lack of theoretical frameworks that enable analysis of learning challenges such as generalization and catastrophic forgetting. To that end, we develop a new theoretical approach for meta continual learning (MCL) in which we mathematically model the learning dynamics using dynamic programming, and we establish conditions of optimality for the MCL problem. Moreover, using this theoretical framework, we derive a new dynamic-programming-based MCL method that adopts stochastic-gradient-driven alternating optimization to balance generalization and catastrophic forgetting. We show that, on MCL benchmark datasets, our theoretically grounded method achieves accuracy better than or comparable to that of existing state-of-the-art methods.
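The paper's dynamic-programming derivation is not reproduced here, but the training pattern it names, stochastic-gradient-driven alternating optimization, can be sketched generically: alternate gradient steps between a current-task loss (generalization) and a loss on retained earlier-task data (forgetting). This is only an illustration of the balancing idea, not the paper's method.

```python
# Generic sketch of alternating optimization between a generalization term
# (current-task batch) and a forgetting term (batch drawn from earlier tasks).
import torch
import torch.nn.functional as F

def alternating_step(model, opt, new_batch, memory_batch):
    """One alternating update: a current-task step, then a memory step."""
    for x, y in (new_batch, memory_batch):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
```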
The task of Knowledge Graph Completion (KGC) aims to automatically infer missing facts in a Knowledge Graph (KG). In this paper, we take a new perspective: leveraging rich user-item interaction data (user interaction data for short) to improve the KGC task. Our work is inspired by the observation that many KG entities correspond to online items in application systems. However, the two kinds of data sources have very different intrinsic characteristics, and a simple fusion strategy is likely to hurt the original performance. To address this challenge, we propose a novel adversarial learning approach that leverages user interaction data for the KGC task. Our generator is isolated from the user interaction data and serves to improve the performance of the discriminator. The discriminator takes the useful information learned from user interaction data as input and gradually enhances its evaluation capacity in order to identify the fake samples produced by the generator. To discover the implicit entity preferences of users, we design an elaborate collaborative learning algorithm based on graph neural networks, which is jointly optimized with the discriminator. Such an approach is effective in alleviating the issues of data heterogeneity and semantic complexity in the KGC task. Extensive experiments on three real-world datasets demonstrate the effectiveness of our approach on the KGC task.
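The discriminator side of this adversarial setup can be sketched as a triple scorer trained with a margin loss against generator-produced negatives. The TransE-style score, the embedding sizes, and the margin are illustrative; the GNN-based collaborative module that injects user-interaction signals is abstracted away here.

```python
# Sketch: a discriminator that scores (head, relation, tail) triples and is
# trained to rank true triples above generator-produced fakes.
import torch
import torch.nn as nn

class TripleScorer(nn.Module):
    """Discriminator: higher (less negative) score means more plausible triple."""
    def __init__(self, n_entities, n_relations, dim=128):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
    def forward(self, h, r, t):
        # TransE-style score: -||e_h + e_r - e_t||
        return -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

def discriminator_loss(scorer, true_triples, fake_triples, margin=1.0):
    """Margin ranking loss: push true triples above generated fakes."""
    s_true = scorer(*true_triples)
    s_fake = scorer(*fake_triples)
    return torch.relu(margin - s_true + s_fake).mean()

# The generator is trained in turn to produce fakes that raise this loss,
# gradually sharpening the discriminator's evaluation capacity.
```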
