We evaluate the efficacy of predicted UPOS tags as input features for dependency parsers in lower-resource settings, examining how treebank size affects the impact that tagging accuracy has on parsing performance. We do this for real low-resource Universal Dependencies treebanks, for artificially low-resource data with varying treebank sizes, and for very small treebanks with varying amounts of augmented data. We find that predicted UPOS tags are somewhat helpful for low-resource treebanks, especially when fewer fully annotated trees are available. We also find that this positive impact diminishes as the amount of data increases.
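As an illustration of the "artificially low-resource data with varying treebank sizes" setting, the following is a minimal sketch of how nested subsets of a treebank could be drawn. It is not the authors' procedure: the function names `read_conllu_sentences` and `subsample_treebank`, the seed, and the subset sizes are all hypothetical, and the only assumption about the data is that it is CoNLL-U text with sentences separated by blank lines.

```python
import random
import re


def read_conllu_sentences(text):
    """Split CoNLL-U text into sentence blocks (sentences are blank-line separated)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.strip() for block in blocks if block.strip()]


def subsample_treebank(sentences, sizes, seed=0):
    """Return {size: subset} for each requested size that fits the treebank.

    Hypothetical helper: the sentences are shuffled once with a fixed seed,
    and each subset is a prefix of that shuffle, so smaller subsets are
    contained in larger ones.
    """
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sizes if n <= len(shuffled)}


if __name__ == "__main__":
    # Toy three-sentence "treebank" with only ID, FORM, LEMMA, UPOS columns.
    toy = (
        "# text = a\n1\ta\t_\tDET\n\n"
        "# text = b\n1\tb\t_\tNOUN\n\n"
        "# text = c\n1\tc\t_\tVERB\n"
    )
    sentences = read_conllu_sentences(toy)
    for size, subset in subsample_treebank(sentences, [1, 2, 5]).items():
        print(size, len(subset))
```

Drawing every subset as a prefix of one shuffle keeps the sizes comparable, since a larger artificial treebank only ever adds trees to a smaller one.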
References used
https://aclanthology.org/