Dialect Identification through Adversarial Learning and Knowledge Distillation on Romanian BERT

Published by the Association for Computational Linguistics in 2021. Field: Artificial Intelligence. Language: English.

Abstract in English

Dialect identification is a task with applicability in a vast array of domains, ranging from automatic speech recognition to opinion mining. This work presents the architectures we used for the VarDial 2021 Romanian Dialect Identification subtask. We introduce a series of solutions based on Romanian or multilingual Transformers, as well as adversarial training techniques. We also experimented with a knowledge distillation tool to check whether a smaller model can maintain the performance of our best approach. Our best solution obtained a weighted F1-score of 0.7324, placing 2nd on the leaderboard.
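
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted F1-score is commonly computed: the per-class F1 scores are averaged, each weighted by the number of true examples of that class. The function name `weighted_f1`, the labels "RO" and "MD", and the toy data are illustrative assumptions, not taken from the paper or from the shared task's official scorer.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: predicted `label` and the gold label agrees.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of the gold labels.
        score += (n_true / total) * f1
    return score


# Hypothetical toy data: two made-up dialect labels, four samples.
gold = ["RO", "RO", "RO", "MD"]
pred = ["RO", "RO", "MD", "MD"]
print(round(weighted_f1(gold, pred), 4))
```

Because each class is weighted by its support, a model that performs well on the majority class scores higher under this metric than under an unweighted (macro) average.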

