
Leveraging Transfer Learning for Reliable Intelligence Identification on Vietnamese SNSs (ReINTEL)

Published by: Long Phan
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





This paper proposes several transformer-based approaches for Reliable Intelligence Identification on Vietnamese social network sites at the VLSP 2020 evaluation campaign. We exploit both monolingual and multilingual pre-trained models. In addition, we use an ensemble method to improve the robustness of the different approaches. Our team achieved a ROC-AUC score of 0.9378 on the private test set, which is competitive with the other participants.
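As an illustration of the ensembling idea described above, the following is a minimal sketch of score-level averaging of reliability probabilities from two fine-tuned transformers, evaluated with ROC-AUC. The model names in the comments (PhoBERT, XLM-R) and the equal weighting are assumptions for illustration, not the authors' exact setup.

```python
# Score-level ensemble of two fine-tuned transformers, scored with ROC-AUC.
# Model names and the equal weighting are illustrative assumptions.
import numpy as np
from sklearn.metrics import roc_auc_score

def ensemble_scores(mono_probs, multi_probs, w_mono=0.5):
    """Weighted average of reliability probabilities from two models."""
    mono_probs = np.asarray(mono_probs, dtype=float)
    multi_probs = np.asarray(multi_probs, dtype=float)
    return w_mono * mono_probs + (1.0 - w_mono) * multi_probs

# Dummy predictions on a labelled validation split.
y_true = np.array([1, 0, 1, 1, 0])
mono  = np.array([0.91, 0.20, 0.76, 0.85, 0.33])   # e.g. a monolingual model (PhoBERT)
multi = np.array([0.88, 0.35, 0.81, 0.79, 0.25])   # e.g. a multilingual model (XLM-R)
print("ensemble ROC-AUC:", roc_auc_score(y_true, ensemble_scores(mono, multi)))
```

Weighted averaging is only one option; rank averaging or majority voting over the individual model predictions are equally common ways to build such an ensemble.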


Read also

This paper presents the system that we propose for the Reliable Intelligence Identification on Vietnamese Social Network Sites (ReINTEL) task of the Vietnamese Language and Speech Processing 2020 (VLSP 2020) Shared Task. In this task, VLSP 2020 provides a dataset with approximately 6,000 training news/posts annotated with reliable or unreliable labels, and a test set consisting of 2,000 unlabeled examples. In this paper, we conduct experiments on different transfer learning models, namely bert4news and PhoBERT, fine-tuned to predict whether the news is reliable or not. In our experiments, we achieve an AUC score of 94.52% on the private test set from the ReINTEL organizers.
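As a rough illustration of this kind of transfer learning, here is a minimal fine-tuning sketch using the Hugging Face Transformers API with PhoBERT as the backbone. The tiny in-line dataset, the hyperparameters, and the preprocessing (no Vietnamese word segmentation) are placeholders, not the authors' configuration.

```python
# Minimal PhoBERT fine-tuning sketch with Hugging Face Transformers.
# The two-example dataset and the hyperparameters are placeholders only.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/phobert-base", num_labels=2)

# Placeholder data; the real task uses the ~6,000 annotated news/posts.
train_ds = Dataset.from_dict({
    "text": ["một bản tin đáng tin cậy", "một bản tin không đáng tin cậy"],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="reintel-phobert",
                         per_device_train_batch_size=16,
                         num_train_epochs=3)
trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()
```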
This paper describes our method for fine-tuning a transformer-based pretrained model and adapting it to the Reliable Intelligence Identification on Vietnamese SNSs problem. We also propose a model that combines BERT-base pretrained models with metadata features of SNS documents, such as the number of comments, the number of likes, and attached images, ... to improve results for the VLSP shared task: Reliable Intelligence Identification on Vietnamese SNSs. With appropriate training techniques, our model achieves 0.9392 ROC-AUC on the public test set, and the final version settles at the top-2 ROC-AUC (0.9513) on the private test set.
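A hypothetical sketch of the "text encoder plus metadata" idea follows: the encoder's [CLS] representation is concatenated with a small vector of normalized post metadata (likes, comments, has-image flag) before a classification head. The encoder choice, feature set, and layer sizes are illustrative assumptions, not the authors' exact model.

```python
# Hypothetical "text + metadata" classifier: concatenate the encoder's
# [CLS] representation with normalized metadata before the head.
import torch
import torch.nn as nn
from transformers import AutoModel

class TextWithMetadataClassifier(nn.Module):
    def __init__(self, encoder_name="vinai/phobert-base", n_meta=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.head = nn.Sequential(
            nn.Linear(hidden + n_meta, 256), nn.ReLU(),
            nn.Dropout(0.2), nn.Linear(256, 1))

    def forward(self, input_ids, attention_mask, meta):
        # meta: float tensor of shape (batch, n_meta), e.g. scaled
        # [n_likes, n_comments, has_image].
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                  # [CLS]-token representation
        logits = self.head(torch.cat([cls, meta], dim=-1))
        return logits.squeeze(-1)                          # one reliability logit per post
```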
Student feedback is an important source for collecting students' opinions in order to improve the quality of training activities. By applying sentiment analysis to student feedback data, we can determine sentiment polarities that expose problems in the institution, so that the necessary changes can be applied to improve the quality of teaching and learning. This study focused on machine learning and natural language processing techniques (Naive Bayes, Maximum Entropy, Long Short-Term Memory, Bi-Directional Long Short-Term Memory) applied to the Vietnamese Students' Feedback Corpus collected from a university. The final results were compared and evaluated to find the most effective model based on different evaluation criteria. The experimental results show that the Bi-Directional Long Short-Term Memory algorithm outperformed the three other algorithms in terms of F1-score, with 92.0% on the sentiment classification task and 89.6% on the topic classification task. In addition, we developed a sentiment analysis application for analyzing student feedback. The application helps the institution recognize students' opinions about a problem and identify shortcomings that still exist. With this application, the institution can propose appropriate methods to improve the quality of training activities in the future.
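For reference, a bidirectional LSTM text classifier of the kind compared in this study can be sketched in a few lines of Keras; the vocabulary size, sequence length, and number of classes below are illustrative assumptions, not the study's settings.

```python
# Minimal Keras BiLSTM text classifier; sizes and class count are illustrative.
import numpy as np
from tensorflow.keras import layers, models

def build_bilstm(vocab_size=20000, n_classes=3):
    return models.Sequential([
        layers.Embedding(vocab_size, 128),
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_bilstm()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

dummy_batch = np.random.randint(0, 20000, size=(2, 100))   # 2 padded token-id sequences
print(model(dummy_batch).shape)                             # (2, 3) class probabilities
```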
Di Jin, Zhijing Jin, Zhiting Hu (2020)
Text style transfer (TST) is an important task in natural language generation (NLG), which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others. It has a long history in the field of natural language processing (NLP), and has recently regained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data. We also provide discussions on a variety of important topics regarding the future development of TST. Our curated paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey
Self-supervised representation learning has achieved remarkable success in recent years. By subverting the need for supervised labels, such approaches are able to utilize the numerous unlabeled images that exist on the Internet and in photographic datasets. Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn not only from datasets but also from environments. An agent in a natural environment will not typically be fed curated data. Instead, it must explore its environment to acquire the data it will learn from. We propose a framework, curious representation learning (CRL), which jointly learns a reinforcement learning policy and a visual representation model. The policy is trained to maximize the error of the representation learner, and in doing so is incentivized to explore its environment. At the same time, the learned representation becomes stronger and stronger as the policy feeds it ever harder data to learn from. Our learned representations enable promising transfer to downstream navigation tasks, performing better than or comparably to ImageNet pretraining without using any supervision at all. In addition, despite being trained in simulation, our learned representations can obtain interpretable results on real images. Code is available at https://yilundu.github.io/crl/.
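A toy sketch of the CRL coupling may help: the policy is rewarded by the representation learner's loss, so it is pushed toward observations the learner currently handles poorly, while the learner keeps training on whatever the policy selects. Everything below (a tiny autoencoder as the learner, a bandit-style REINFORCE policy over synthetic "environments") is a stand-in to illustrate the objective, not the paper's architecture.

```python
# Toy illustration of the CRL coupling: the policy's reward is the
# representation learner's loss. The autoencoder learner and the
# REINFORCE bandit over synthetic "environments" are stand-ins.
import torch
import torch.nn as nn

learner = nn.Sequential(nn.Linear(32, 8), nn.ReLU(), nn.Linear(8, 32))  # tiny autoencoder
optim_l = torch.optim.Adam(learner.parameters(), lr=1e-3)

policy_logits = torch.zeros(4, requires_grad=True)   # preference over 4 "environments"
optim_p = torch.optim.Adam([policy_logits], lr=1e-2)

for step in range(200):
    probs = torch.softmax(policy_logits, dim=0)
    env = torch.multinomial(probs, 1).item()          # policy picks an environment
    obs = torch.randn(16, 32) * (env + 1)             # harder envs yield noisier data
    loss = ((learner(obs) - obs) ** 2).mean()         # representation (reconstruction) loss

    optim_l.zero_grad(); loss.backward(); optim_l.step()   # learner improves on chosen data

    reward = loss.detach()                            # policy reward = learner's error
    pg_loss = -torch.log(probs[env]) * reward         # REINFORCE: seek hard observations
    optim_p.zero_grad(); pg_loss.backward(); optim_p.step()
```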
