Training large language models can consume substantial amounts of energy. We hypothesize that a language model's configuration affects its energy consumption, and that there is room to optimise power consumption in modern large language models. To investigate these claims, we introduce a power consumption factor into the objective function and explore a range of model and hyperparameter configurations to determine their effect on power. We identify multiple configuration factors that can reduce power consumption during language model training while retaining model quality.
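The abstract does not specify how the power consumption factor enters the objective. Below is a minimal, purely illustrative sketch in Python, assuming the factor acts as a weighted penalty added to the usual task loss; the names `task_loss`, `measured_power_watts`, `baseline_power_watts`, and `power_weight` are hypothetical and not taken from the paper.

```python
# Illustrative sketch (assumption): a power-aware objective formed by adding
# a normalised power penalty to the ordinary training loss. All names and
# default values here are hypothetical, not from the paper.

def power_aware_objective(task_loss: float,
                          measured_power_watts: float,
                          baseline_power_watts: float = 250.0,
                          power_weight: float = 0.01) -> float:
    """Combine the task loss with a power penalty scaled by power_weight."""
    power_penalty = measured_power_watts / baseline_power_watts
    return task_loss + power_weight * power_penalty


# Example: a cross-entropy loss of 2.3 with a measured 300 W draw
# against a 250 W baseline.
print(power_aware_objective(task_loss=2.3, measured_power_watts=300.0))
```

In a setup like this, `power_weight` would trade off model quality against energy use, which mirrors the paper's stated goal of reducing power consumption during training while retaining quality.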