We describe the UTFPR systems submitted to the Lexical Complexity Prediction shared task of SemEval 2021. They perform complexity prediction by combining classic features, such as word frequency, n-gram frequency, word length, and number of senses, with BERT vectors. We test numerous feature combinations and machine learning models in our experiments and find that BERT vectors, even if not optimized for the task at hand, are a great complement to classic features. We also find that employing the principle of compositionality can potentially help in phrase complexity prediction. Our systems place 45th out of 55 for single words and 29th out of 38 for phrases.
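As a rough illustration of the two ideas in the abstract, the sketch below concatenates a few classic features with a contextual vector, and estimates phrase complexity from per-word predictions. It is not the UTFPR implementation. The function names, the toy frequency and sense tables, the `toy_embed` stand-in for a BERT vector, and the choice of averaging as the composition function are all assumptions made for the example, since the abstract does not specify them.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence


def classic_features(word: str, freq: Dict[str, float], senses: Dict[str, int]) -> List[float]:
    """Hand-crafted features: word frequency, word length, number of senses."""
    w = word.lower()
    return [freq.get(w, 0.0), float(len(w)), float(senses.get(w, 0))]


def combined_features(
    word: str,
    freq: Dict[str, float],
    senses: Dict[str, int],
    embed: Callable[[str], Sequence[float]],
) -> List[float]:
    """Concatenate the classic features with an embedding vector.

    `embed` is a hypothetical callable standing in for a BERT encoder.
    """
    return classic_features(word, freq, senses) + list(embed(word))


def phrase_complexity(phrase: str, word_model: Callable[[str], float]) -> float:
    """Compositional estimate: average the predicted complexity of each word.

    Averaging is an assumption here; other composition functions are possible.
    """
    return mean(word_model(w) for w in phrase.split())


# Toy lookup tables (made-up values, for illustration only).
freq = {"river": 0.8, "bank": 0.6}
senses = {"river": 1, "bank": 10}


def toy_embed(word: str) -> List[float]:
    # Stand-in for a contextual vector; a real system would call a BERT model.
    return [len(word) / 10.0, 0.0]


def toy_word_model(word: str) -> float:
    # Stand-in for a trained single-word regressor: rarer words score higher.
    return 1.0 - freq.get(word.lower(), 0.0)


vec = combined_features("bank", freq, senses, toy_embed)
score = phrase_complexity("river bank", toy_word_model)
```

In a real pipeline, `vec` would be the input row for a regression model trained on the annotated complexity scores, and `toy_word_model` would be replaced by that trained single-word model.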
References used
https://aclanthology.org/
This paper presents the results and main findings of SemEval-2021 Task 1 - Lexical Complexity Prediction. We provided participants with an augmented version of the CompLex Corpus (Shardlow et al. 2020). CompLex is an English multi-domain corpus in wh
In this contribution, we describe the system presented by the PolyU CBS-Comp Team at the Task 1 of SemEval 2021, where the goal was the estimation of the complexity of words in a given sentence context. Our top system, based on a combination of lexic
This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context. We assemble a feature engineering-based model with a deep neural network model founded on BERT. While BERT
Lexical complexity plays an important role in reading comprehension. Lexical complexity prediction (LCP) can be used not only as part of lexical simplification systems, but also as a stand-alone application to help people read better. This paper
This paper describes team LCP-RIT's submission to the SemEval-2021 Task 1: Lexical Complexity Prediction (LCP). The task organizers provided participants with an augmented version of CompLex (Shardlow et al., 2020), an English multi-domain dataset in