UNBNLP at SemEval-2021 Task 1: Predicting lexical complexity with masked language models and character-level encoders


Abstract in English

In this paper, we present three supervised systems for English lexical complexity prediction of single words and multiword expressions for SemEval-2021 Task 1. We explore the use of statistical baseline features, masked language models, and character-level encoders to predict the complexity of a target token in context. Our best system combines information from these three sources. The results indicate that information from masked language models and character-level encoders can be combined to improve lexical complexity prediction.
