تصف هذه الورقة نظام مقدم من فريق Biggreen إلى LCP 2021 للتنبؤ بالتعقيد المعجمي للكلمات الإنجليزية في سياق معين.نحن نكرب نموذجا يعتمد على الهندسة مع نموذج شبكة عصبي عميق تأسست على بيرتف.بينما ينفذ بيرت نفسها بشكل تنافسي، فإن نموذجنا القائم على الهندسة يساعد في الحالات القصوى، على سبيل المثال.فصل حالات الصعوبة السهلة والمحايدة.تضم ميزاتنا المصنوعة يدويا اتساعا من التدابير الصوفية المعجمية والدلية والمعنية والرواية.تقدم تصورات خرائط بيرت اهتماما نظرة ثاقبة للميزات المحتملة التي قد تتعلمها نماذج المحولات عند ضبطها من أجل تنبؤ التعقيد المعجمي.تنقيح تنبؤاتنا المعقولة بشكل معقول بالنسبة للكلمة الفرعية الواحدة، ونظهر كيف يمكن تسخيرها لأداء الاستاحا الفرعي للتعبير المتعدد الآن.
This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context. We assemble a feature engineering-based model with a deep neural network model founded on BERT. While BERT itself performs competitively, our feature engineering-based model helps in extreme cases, eg. separating instances of easy and neutral difficulty. Our handcrafted features comprise a breadth of lexical, semantic, syntactic, and novel phonological measures. Visualizations of BERT attention maps offer insight into potential features that Transformers models may learn when fine-tuned for lexical complexity prediction. Our ensembled predictions score reasonably well for the single word subtask, and we demonstrate how they can be harnessed to perform well on the multi word expression subtask too.
References used
https://aclanthology.org/
This paper presents the results and main findings of SemEval-2021 Task 1 - Lexical Complexity Prediction. We provided participants with an augmented version of the CompLex Corpus (Shardlow et al. 2020). CompLex is an English multi-domain corpus in wh
Lexical complexity plays an important role in reading comprehension. lexical complexity prediction (LCP) can not only be used as a part of Lexical Simplification systems, but also as a stand-alone application to help people better reading. This paper
In this contribution, we describe the system presented by the PolyU CBS-Comp Team at the Task 1 of SemEval 2021, where the goal was the estimation of the complexity of words in a given sentence context. Our top system, based on a combination of lexic
Lexical complexity prediction (LCP) conveys the anticipation of the complexity level of a token or a set of tokens in a sentence. It plays a vital role in the improvement of various NLP tasks including lexical simplification, translations, and text g
In this paper, we present our contribution in SemEval-2021 Task 1: Lexical Complexity Prediction, where we integrate linguistic, statistical, and semantic properties of the target word and its context as features within a Machine Learning (ML) framew