
Regularising Fisher Information Improves Cross-lingual Generalisation

223 - Association for Computational Linguistics 2021, article

Many recent works use 'consistency regularisation' to improve the generalisation of fine-tuned pre-trained models, both multilingual and English-only. These works encourage model outputs to be similar between a perturbed and a normal version of the input, usually by penalising the Kullback-Leibler (KL) divergence between the output probability distributions of the perturbed and normal model. We believe that consistency losses may be implicitly regularising the loss landscape. In particular, we build on work hypothesising that implicitly or explicitly regularising the trace of the Fisher Information Matrix (FIM) amplifies the implicit bias of SGD to avoid memorisation. Our initial results show, both empirically and theoretically, that consistency losses are related to the FIM, and that the flat minima implied by a small trace of the FIM improve performance when fine-tuning a multilingual model on additional languages. We aim to confirm these initial results on more datasets, and to use our insights to develop better multilingual fine-tuning techniques.
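To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear softmax classifier in place of a fine-tuned multilingual model, takes the KL divergence in the direction KL(clean || perturbed), and computes the FIM trace exactly by summing over labels drawn from the model's own predictive distribution. All function names (`consistency_loss`, `fisher_trace`, and the helpers) are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, x):
    # Toy linear classifier (an assumption): one weight row per class.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def kl_divergence(p, q):
    # KL(p || q) for discrete distributions with strictly positive q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def consistency_loss(weights, x, x_perturbed):
    # Penalise disagreement between predictions on clean and perturbed input.
    p_clean = softmax(logits_for(weights, x))
    p_pert = softmax(logits_for(weights, x_perturbed))
    return kl_divergence(p_clean, p_pert)


def fisher_trace(weights, inputs):
    # Tr(F) = E_x E_{y ~ p(.|x)} || grad_W log p(y|x) ||^2.
    # For softmax(Wx), grad_W log p(y|x) = (onehot(y) - p) x^T, so the
    # squared norm factorises into ||onehot(y) - p||^2 * ||x||^2.
    total = 0.0
    for x in inputs:
        p = softmax(logits_for(weights, x))
        x_sq = sum(xi * xi for xi in x)
        for y, p_y in enumerate(p):
            resid_sq = sum(
                ((1.0 if k == y else 0.0) - p_k) ** 2 for k, p_k in enumerate(p)
            )
            total += p_y * resid_sq * x_sq
    return total / len(inputs)


if __name__ == "__main__":
    W = [[1.0, -0.5], [0.2, 0.3], [-0.7, 0.9]]  # 3 classes, 2 features
    x = [1.0, 2.0]
    x_noisy = [1.1, 1.9]  # stand-in for a perturbed input
    print("consistency loss:", consistency_loss(W, x, x_noisy))
    print("trace of FIM:", fisher_trace(W, [x, x_noisy]))
```

In this toy setting the trace reduces to the average of (1 - sum_k p_k^2) * ||x||^2 over inputs, so it shrinks as predictions become more confident. For a deep network the same trace is usually estimated by sampling labels from the model and averaging squared gradient norms rather than enumerating every class.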

