This paper describes the submission of the CU-UBC team for the SIGMORPHON 2021 Shared Task 2: Unsupervised morphological paradigm clustering. Our system generates paradigms using morphological transformation rules which are discovered from raw data. We experiment with two methods for discovering rules. Our first approach generates prefix and suffix transformations between similar strings. Our second approach uses more general rules, which can apply transformations inside the input strings in addition to prefix and suffix transformations. We find that the best overall performance is delivered by prefix and suffix rules, but the more general transformation rules perform better for languages with templatic morphology and very high morpheme-to-word ratios.
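To make the idea of prefix and suffix transformations between similar strings more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the CU-UBC system: the function names (`suffix_rule`, `prefix_rule`, `discover_rules`), the `min_stem` threshold, and the use of raw rule counts as a ranking signal are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def suffix_rule(a: str, b: str, min_stem: int = 3):
    """Return (suffix_of_a, suffix_of_b) when a and b share a prefix of at
    least min_stem characters, else None. min_stem is an assumed threshold."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    if i < min_stem:
        return None
    return (a[i:], b[i:])


def prefix_rule(a: str, b: str, min_stem: int = 3):
    """Mirror image of suffix_rule: compare the reversed strings, then
    reverse the differing parts back into reading order."""
    rule = suffix_rule(a[::-1], b[::-1], min_stem)
    if rule is None:
        return None
    return (rule[0][::-1], rule[1][::-1])


def discover_rules(words, min_stem: int = 3) -> Counter:
    """Count candidate (kind, from, to) rules over all pairs of word types.
    Counting is an assumption here: frequent rules are more likely to be
    real morphology than accidental string overlap."""
    counts = Counter()
    for a, b in combinations(sorted(set(words)), 2):
        for kind, fn in (("suffix", suffix_rule), ("prefix", prefix_rule)):
            rule = fn(a, b, min_stem)
            if rule is not None:
                counts[(kind, *rule)] += 1
    return counts


if __name__ == "__main__":
    toy = ["walk", "walks", "walked", "talk", "talks", "talked"]
    for rule, n in discover_rules(toy).most_common():
        print(rule, n)
```

On the toy list, the rule `("suffix", "", "ed")` is found twice (walk/walked and talk/talked). The sketch also produces the spurious rule `("prefix", "t", "w")` from pairs such as talk/walk, which shows why a real system would need some way to filter candidate rules before clustering words into paradigms. Rules that rewrite material inside the string, as in the second approach, are outside the scope of this sketch.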
References used
https://aclanthology.org/