Research papers and master's and doctoral theses about filtering of neural

Sampling and Filtering of Neural Machine Translation Distillation Data

208 - Association for Computational Linguistics 2021 Article

In most neural machine translation distillation or stealing scenarios, the highest-scoring hypothesis of the target model (teacher) is used to train a new model (student). If reference translations are also available, then better hypotheses (with respect to the references) can be oversampled and poor hypotheses either removed or undersampled. This paper explores the sampling method landscape (pruning, hypothesis oversampling and undersampling, deduplication, and their combination) with English-to-Czech and English-to-German MT models using standard MT evaluation metrics. We show that careful oversampling and combination with the original data leads to better performance than training only on the original data, only on the synthesized data, or on their direct combination.
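The sampling operations named in the abstract (deduplication, pruning, oversampling and undersampling, and combination with the original data) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `resample` and `combine`, the `copies` parameter, and the idea of assigning a repeat count per rank are assumptions made here for illustration. It also assumes that each teacher hypothesis already carries a score computed against the reference with some MT metric.

```python
from typing import Dict, List, Tuple

# (hypothesis, score against the reference); the metric is left unspecified.
Scored = Tuple[str, float]
Pair = Tuple[str, str]  # (source sentence, target sentence)


def resample(nbest: Dict[str, List[Scored]], copies: List[int]) -> List[Pair]:
    """Build student training pairs from teacher n-best lists.

    copies[i] is how many times the i-th best hypothesis is emitted.
    Ranks beyond len(copies) are pruned; a count of 0 drops that rank.
    (Hypothetical scheme for illustration, not taken from the paper.)
    """
    pairs: List[Pair] = []
    for source, hyps in nbest.items():
        # Deduplication: keep each distinct hypothesis once, with its best score.
        best: Dict[str, float] = {}
        for hyp, score in hyps:
            if hyp not in best or score > best[hyp]:
                best[hyp] = score
        # Rank by score, best first (sorted is stable, so ties keep input order).
        ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
        # Pruning and over/undersampling: zip stops at the shorter sequence.
        for (hyp, _score), count in zip(ranked, copies):
            pairs.extend([(source, hyp)] * count)
    return pairs


def combine(original: List[Pair], synthetic: List[Pair]) -> List[Pair]:
    """Concatenate the original parallel data with the resampled teacher output."""
    return list(original) + list(synthetic)


# Toy input with invented scores, for illustration only.
nbest = {
    "Hello": [("Ahoj", 0.9), ("Dobry den", 0.6), ("Ahoj", 0.8), ("Nazdar", 0.2)],
}
synthetic = resample(nbest, copies=[2, 1])  # best twice, second once, rest pruned
training_data = combine([("Hello", "Ahoj")], synthetic)
```

With `copies=[2, 1]`, the best hypothesis per source is oversampled, the second-best is kept once, and all lower ranks are pruned. Setting `copies=[1]` reduces to the usual setup where only the top hypothesis trains the student, except that "top" here means best against the reference rather than highest teacher score.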

crowdsourcing, natural language, machine translation, distillation, filtering of neural, distillation apparatus, translation, neural filtering, phosphoric acid industry
