In this paper, we investigate the efficacy of using contextual embeddings from multilingual BERT and German BERT to identify fact-claiming comments in German on social media. Additionally, we examine the impact of formulating the classification problem as a multi-task learning problem, in which the model identifies the toxicity and engagement of a comment in addition to whether it is fact-claiming. We provide a thorough comparison of the two BERT-based models against a logistic regression baseline and show that German BERT features trained with a multi-task objective achieve the best F1 score on the test set. This work was done as part of a submission to the GermEval 2021 shared task on the identification of fact-claiming comments.
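To make the multi-task formulation concrete, the following is a minimal sketch, not the submission code, of a shared German BERT encoder with one binary classification head per GermEval 2021 subtask (toxic, engaging, fact-claiming). The checkpoint name bert-base-german-cased, the head names, and the unweighted sum of per-task losses are illustrative assumptions rather than details taken from the paper.

```python
# Sketch: multi-task binary classification on top of shared German BERT features.
# Assumes the Hugging Face `transformers` library; checkpoint and task names are
# illustrative, not the authors' exact configuration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-german-cased"  # assumed German BERT checkpoint
TASKS = ("toxic", "engaging", "fact_claiming")

class MultiTaskBert(nn.Module):
    def __init__(self, model_name: str = MODEL_NAME):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # One binary head per subtask; all heads share the BERT encoder.
        self.heads = nn.ModuleDict({t: nn.Linear(hidden, 1) for t in TASKS})

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return {t: head(cls).squeeze(-1) for t, head in self.heads.items()}

def multi_task_loss(logits: dict, labels: dict) -> torch.Tensor:
    """Unweighted sum of per-task binary cross-entropy losses."""
    bce = nn.BCEWithLogitsLoss()
    return sum(bce(logits[t], labels[t].float()) for t in logits)

if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = MultiTaskBert()
    batch = tok(["Das ist doch nachweislich falsch!"], return_tensors="pt",
                padding=True, truncation=True)
    logits = model(batch["input_ids"], batch["attention_mask"])
    labels = {t: torch.ones(1) for t in logits}  # dummy labels for the example
    print(multi_task_loss(logits, labels).item())
```

Training the fact-claiming head jointly with the toxicity and engagement heads in this way lets the shared encoder benefit from all three label signals, which is the effect the comparison against the single-task and logistic regression setups is meant to measure.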