Discourse segmentation and sentence-level discourse parsing play important roles in various NLP tasks that need to account for textual coherence. Despite recent progress on both tasks, there is still room for improvement because labeled data is scarce. To address this, we propose a language model-based generative classifier (LMGC) that uses more information from labels by treating them as input, and that enhances label representations by embedding a description of each label. Moreover, since this lets LMGC prepare representations for labels that were unseen in the pre-training step, we can effectively use a pre-trained language model in LMGC. Experimental results on the RST-DT dataset show that LMGC achieved a state-of-the-art F1 score of 96.72 in discourse segmentation. In sentence-level discourse parsing, it further achieved state-of-the-art relation F1 scores of 84.69 with gold EDU boundaries and 81.18 with automatically segmented boundaries.
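The idea of treating labels as input can be illustrated with a toy sketch: a language model scores the joint sequence formed by the text and each candidate label, and the highest-scoring label wins. Everything below is an assumption made for illustration, not the LMGC implementation: the add-one-smoothed bigram model stands in for a pre-trained language model, the label tokens `<Cause>` and `<Contrast>`, the tiny training set, and the helper names `BigramLM` and `classify` are hypothetical, and label descriptions are not modeled.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one-smoothed bigram LM (toy stand-in for a pre-trained language model)."""

    def __init__(self, sequences):
        self.unigrams = Counter()  # counts of a token appearing as the previous token
        self.bigrams = Counter()   # counts of (previous, current) pairs
        self.vocab = set()
        for seq in sequences:
            toks = ["<s>"] + seq
            self.vocab.update(toks)
            for prev, cur in zip(toks, toks[1:]):
                self.unigrams[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def log_prob(self, seq):
        """Log-probability of the whole sequence, labels included."""
        toks = ["<s>"] + seq
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        return sum(
            math.log((self.bigrams[(p, c)] + 1) / (self.unigrams[p] + v))
            for p, c in zip(toks, toks[1:])
        )


def classify(lm, left, right, labels):
    """Insert each candidate label between two spans and keep the most probable sequence."""
    scores = {y: lm.log_prob(left + [y] + right) for y in labels}
    return max(scores, key=scores.get), scores


LABELS = ["<Cause>", "<Contrast>"]  # hypothetical label tokens

# Hypothetical training sequences: left span, label token, right span.
TRAIN = [
    ["it", "rained", "<Contrast>", "but", "we", "played"],
    ["she", "tried", "<Contrast>", "but", "she", "failed"],
    ["we", "stayed", "<Cause>", "because", "it", "rained"],
    ["he", "left", "<Cause>", "because", "she", "called"],
]

lm = BigramLM(TRAIN)
best, scores = classify(lm, ["they", "won"], ["because", "they", "trained"], LABELS)
print(best)  # <Cause>
```

Because the label is part of the scored sequence, the model can use the context on both sides of it; here the word following the label decides the outcome.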
References used
https://aclanthology.org/