The present work aims to assign a complexity score between 0 and 1 to a target word or phrase in a given sentence. For each Single Word Target, a Random Forest Regressor is trained on a feature set consisting of lexical, semantic, and syntactic information about the target. For each Multiword Target, the features of the individual words are combined with their single-word complexities in the feature space. The system achieved Pearson correlations of 0.7402 and 0.8244 on the test set for Single Word and Multiword Targets, respectively.
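The reported scores are Pearson correlations between predicted and gold complexity values. As a minimal sketch of how such a score can be computed, the function below implements the standard Pearson formula in plain Python. The function name `pearson` and the toy score lists are assumptions made for illustration; they are not taken from the system or its data.

```python
import math


def pearson(gold, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(gold) != len(predicted) or len(gold) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_g = sum(gold) / len(gold)
    mean_p = sum(predicted) / len(predicted)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, predicted))
    # Sums of squared deviations (unnormalized variances).
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_g * var_p)


# Toy complexity scores in [0, 1], invented purely for illustration.
gold_scores = [0.10, 0.25, 0.40, 0.55, 0.80]
predicted_scores = [0.15, 0.20, 0.45, 0.50, 0.70]

print(round(pearson(gold_scores, predicted_scores), 4))
```

A value close to 1 means the predicted complexities rank and scale the targets much like the gold annotations do. The measure is invariant to positive linear rescaling of the predictions, so it rewards agreement in trend rather than exact score values.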
References used
https://aclanthology.org/
This paper presents our system for the single- and multi-word lexical complexity prediction tasks of SemEval Task 1: Lexical Complexity Prediction. Text comprehension depends on the reader's ability to understand the words present in it; evaluating t
Evaluating the complexity of a target word in a sentential context is the aim of the Lexical Complexity Prediction task at SemEval-2021. This paper presents the system created to assess single-word lexical complexity, combining linguistic and psycho
This paper presents the results and main findings of SemEval-2021 Task 1 - Lexical Complexity Prediction. We provided participants with an augmented version of the CompLex Corpus (Shardlow et al. 2020). CompLex is an English multi-domain corpus in wh
This paper describes team LCP-RIT's submission to the SemEval-2021 Task 1: Lexical Complexity Prediction (LCP). The task organizers provided participants with an augmented version of CompLex (Shardlow et al., 2020), an English multi-domain dataset in
This paper describes our submission to the SemEval-2021 shared task on Lexical Complexity Prediction. We approached it as a regression problem and present an ensemble combining four systems, one feature-based and three neural with fine-tuning, freque