Research papers, master's theses, and doctoral theses about gradient aligned mutual learning

GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning

268 - Association for Computational Linguistics 2021, article

In this work, we propose a novel framework, Gradient Aligned Mutual Learning BERT (GAML-BERT), for improving the early exiting of BERT. GAML-BERT's contributions are two-fold. First, we conduct a set of pilot experiments, which show that mutual knowledge distillation between a shallow exit and a deep exit leads to better performance for both. Motivated by this observation, we use mutual learning to improve BERT's early exiting performance; that is, we ask each exit of a multi-exit BERT to distill knowledge from the other exits. Second, we propose GA, a novel training method that aligns the gradients from knowledge distillation with those from the cross-entropy losses. Extensive experiments on the GLUE benchmark show that GAML-BERT can significantly outperform the state-of-the-art (SOTA) BERT early exiting methods.
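As a rough illustration of the two ideas named in the abstract, the following Python sketch computes a mutual distillation loss across the exits of a multi-exit classifier and then adjusts a distillation gradient so that it does not oppose the cross-entropy gradient. It is not the authors' implementation. The abstract does not say how GA aligns gradients, so the projection rule in `align_gradient` (removing the component of the distillation gradient that points against the cross-entropy gradient) is an assumption. The function names, the `temperature` parameter, and the choice to average the KL divergence over peer exits are likewise hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_distillation_losses(exit_logits, temperature=1.0):
    """Return one distillation loss per exit.

    Each exit is pulled toward every other exit: its loss is the mean
    KL(peer || self) over all peers (an assumed formulation).
    """
    if len(exit_logits) < 2:
        raise ValueError("mutual learning needs at least two exits")
    probs = [softmax(logits, temperature) for logits in exit_logits]
    losses = []
    for i, p_self in enumerate(probs):
        peers = [p for j, p in enumerate(probs) if j != i]
        total = sum(kl_divergence(p_peer, p_self) for p_peer in peers)
        losses.append(total / len(peers))
    return losses


def align_gradient(g_kd, g_ce):
    """Assumed alignment rule: if the distillation gradient g_kd conflicts
    with the cross-entropy gradient g_ce (negative dot product), project
    g_kd onto the plane orthogonal to g_ce; otherwise leave it unchanged.
    """
    dot = sum(a * b for a, b in zip(g_kd, g_ce))
    norm_sq = sum(b * b for b in g_ce)
    if dot >= 0 or norm_sq == 0:
        return list(g_kd)
    scale = dot / norm_sq
    return [a - scale * b for a, b in zip(g_kd, g_ce)]


if __name__ == "__main__":
    # Three hypothetical exits producing logits for a 2-class task.
    logits_per_exit = [[0.2, 0.1], [1.0, -0.5], [2.0, -1.5]]
    print(mutual_distillation_losses(logits_per_exit))

    # A distillation gradient that conflicts with the cross-entropy gradient.
    print(align_gradient([1.0, -1.0], [0.0, 1.0]))
```

In a real training loop the gradients would be per-parameter tensors taken from the two loss terms, and the aligned distillation gradient would be added to the cross-entropy gradient before the optimizer step. Flat Python lists are used here only to keep the sketch free of dependencies.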

aligned mutual learning, gradient aligned mutual, BERT early exiting
