We present two convolutional neural networks for predicting the complexity of words and phrases in context on a continuous scale. Both models use word and character embeddings alongside lexical features as inputs. Our system achieves reasonable results, with a Pearson correlation of 0.7754 on the task as a whole. We highlight the limitations of this method in properly assessing the context of the target text, and explore the effectiveness of both systems across a range of genres. Both models were submitted to LCP 2021, which frames the identification of complex words and phrases as a context-dependent, regression-based task.
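As an illustration of the reported metric, the following is a minimal sketch of how a Pearson correlation between predicted and gold complexity scores could be computed. The function name `pearson` and the sample scores are assumptions made for this example; they are not taken from the submitted systems or the task data.

```python
import math


def pearson(predicted, gold):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    # Sums of squared deviations (unnormalized variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_g)


# Made-up complexity scores on a continuous 0-1 scale, for illustration only.
gold = [0.10, 0.25, 0.40, 0.55, 0.80]
predicted = [0.15, 0.20, 0.45, 0.50, 0.70]
r = pearson(predicted, gold)
print(f"Pearson r = {r:.4f}")
```

A value near 1 means the predictions rise and fall with the gold annotations. The metric is insensitive to a constant offset or rescaling of the predictions, so it measures agreement in ranking and spread rather than absolute error.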
This paper revisits feature engineering approaches for predicting the complexity level of English words in a particular context using regression techniques. Our best submission to the Lexical Complexity Prediction (LCP) shared task was ranked 3rd out
This paper describes our contribution to SemEval 2021 Task 1 (Shardlow et al., 2021): Lexical Complexity Prediction. In our approach, we leverage the ELECTRA model and attempt to mirror the data annotation scheme. Although the task is a regression ta
This article describes a system to predict the complexity of words for the Lexical Complexity Prediction (LCP) shared task hosted at SemEval 2021 (Task 1) with a new annotated English dataset with a Likert scale. Located in the Lexical Semantics trac
This paper describes the system developed by the Laboratoire d'analyse statistique des textes (LAST) for the Lexical Complexity Prediction shared task at SemEval-2021. The proposed system is made up of a LightGBM model fed with features obtained from
In this paper, we propose a method of fusing sentence information and word frequency information for the SemEval 2021 Task 1-Lexical Complexity Prediction (LCP) shared task. In our system, the sentence information comes from the RoBERTa model, and th