Do you want to publish a course? Click here

In this paper, we present our contribution in SemEval-2021 Task 1: Lexical Complexity Prediction, where we integrate linguistic, statistical, and semantic properties of the target word and its context as features within a Machine Learning (ML) framew ork for predicting lexical complexity. In particular, we use BERT contextualized word embeddings to represent the semantic meaning of the target word and its context. We participated in the sub-task of predicting the complexity score of single words
In this paper, we present our systems submitted to SemEval-2021 Task 1 on lexical complexity prediction.The aim of this shared task was to create systems able to predict the lexical complexity of word tokens and bigram multiword expressions within a given sentence context, a continuous value indicating the difficulty in understanding a respective utterance. Our approach relies on gradient boosted regression tree ensembles fitted using a heterogeneous feature set combining linguistic features, static and contextualized word embeddings, psycholinguistic norm lexica, WordNet, word- and character bigram frequencies and inclusion in wordlists to create a model able to assign a word or multiword expression a context-dependent complexity score. We can show that especially contextualised string embeddings can help with predicting lexical complexity.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا