Research papers, master's and doctoral theses about non-concatenative translation

How Suitable Are Subword Segmentation Strategies for Translating Non-Concatenative Morphology?

211 - Association for Computational Linguistics 2021 Article

Data-driven subword segmentation has become the default strategy for open-vocabulary machine translation and other NLP tasks, but may not be sufficiently generic for optimal learning of non-concatenative morphology. We design a test suite to evaluate segmentation strategies on different types of morphological phenomena in a controlled, semi-synthetic setting. In our experiments, we compare how well machine translation models trained at the subword and character level can translate these morphological phenomena. We find that learning to analyse and generate morphologically complex surface representations is still challenging, especially for non-concatenative morphological phenomena like reduplication or vowel harmony and for rare word stems. Based on our results, we recommend that novel text representation strategies be tested on a range of typologically diverse languages to minimise the risk of adopting a strategy that inadvertently disadvantages certain languages.

translating non-concatenative morphology, non-concatenative morphology, translating non-concatenative

Rethinking Why Intermediate-Task Fine-Tuning Works

210 - Association for Computational Linguistics 2021 Article

Supplementary Training on Intermediate Labeled-data Tasks (STILT) is a widely applied technique that first fine-tunes a pretrained language model on an intermediate task before fine-tuning it on the target task of interest. While STILT is able to further improve the performance of pretrained language models, it is still unclear why and when it works. Previous research shows that intermediate tasks involving complex inference, such as commonsense reasoning, work especially well for RoBERTa-large. In this paper, we discover that the improvement from an intermediate task could be orthogonal to it containing reasoning or other complex skills: a simple real-fake discrimination task synthesized by GPT2 can benefit diverse target tasks. We conduct extensive experiments to study the impact of different factors on STILT. These findings suggest rethinking the role of intermediate fine-tuning in the STILT pipeline.

non-concatenative translation, intermediate labeled-data tasks, intermediate-task fine-tuning works
