Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Adaptor Grammars for Unsupervised Paradigm Clustering

محول النحو على تجميع النموذج غير الخاضع

541 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This work describes the Edinburgh submission to the SIGMORPHON 2021 Shared Task 2 on unsupervised morphological paradigm clustering. Given raw text input, the task was to assign each token to a cluster with other tokens from the same paradigm. We use Adaptor Grammar segmentations combined with frequency-based heuristics to predict paradigm clusters. Our system achieved the highest average F1 score across 9 test languages, placing first out of 15 submissions.

References used

https://aclanthology.org/

rate research

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

724 - Association for Computation Linguistics 2021 مقالة

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms. To this end, we re lease corpora for 5 development and 9 test languages, as well as gold partial paradigms for evaluation. We receive 14 submissions from 4 teams that follow different strategies, and the best performing system is based on adaptor grammars. Results vary significantly across languages. However, all systems are outperformed by a supervised lemmatizer, implying that there is still room for improvement.

unsupervised morphological paradigm morphological paradigm clustering sigmorphon shared task النموذج المورفولوجي غير المدخري تجميع النماذج المورفولوجية Sigmorphon المهمة المشتركة صناعة حمض الفوسفور المزيد..

Unsupervised Paradigm Clustering Using Transformation Rules

557 - Association for Computation Linguistics 2021 مقالة

This paper describes the submission of the CU-UBC team for the SIGMORPHON 2021 Shared Task 2: Unsupervised morphological paradigm clustering. Our system generates paradigms using morphological transformation rules which are discovered from raw data. We experiment with two methods for discovering rules. Our first approach generates prefix and suffix transformations between similar strings. Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations. We find that the best overall performance is delivered by prefix and suffix rules but more general transformation rules perform better for languages with templatic morphology and very high morpheme-to-word ratios.

نموذج مورفولوجي صناعة حمض الفوسفور

Paradigm Clustering with Weighted Edit Distance

837 - Association for Computation Linguistics 2021 مقالة

This paper describes our system for the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering, which asks participants to group inflected forms together according their underlying lemma without the aid of annotated training da ta. We employ agglomerative clustering to group word forms together using a metric that combines an orthographic distance and a semantic distance from word embeddings. We experiment with two variations of an edit distance-based model for quantifying orthographic distance, but, due to time constraints, our system does not improve over the shared task's baseline system.

نموذج مورفولوجي weighted edit distance تعديل المسافة المرجحة صناعة حمض الفوسفور

Orthographic vs. Semantic Representations for Unsupervised Morphological Paradigm Clustering

547 - Association for Computation Linguistics 2021 مقالة

This paper presents two different systems for unsupervised clustering of morphological paradigms, in the context of the SIGMORPHON 2021 Shared Task 2. The goal of this task is to correctly cluster words in a given language by their inflectional parad igm, without any previous knowledge of the language and without supervision from labeled data of any sort. The words in a single morphological paradigm are different inflectional variants of an underlying lemma, meaning that the words share a common core meaning. They also - usually - show a high degree of orthographical similarity. Following these intuitions, we investigate KMeans clustering using two different types of word representations: one focusing on orthographical similarity and the other focusing on semantic similarity.Additionally, we discuss the merits of randomly initialized centroids versus pre-defined centroids for clustering. Pre-defined centroids are identified based on either a standard longest common substring algorithm or a connected graph method built off of longest common substring. For all development languages, the character-based embeddings perform similarly to the baseline, and the semantic embeddings perform well below the baseline.Analysis of the systems' errors suggests that clustering based on orthographic representations is suitable for a wide range of morphological mechanisms, particularly as part of a larger system.

تجميع النموذج morphological paradigm نموذج مورفولوجي صناعة حمض الفوسفور

Task-Oriented Clustering for Dialogues

1108 - Association for Computation Linguistics 2021 مقالة

A reliable clustering algorithm for task-oriented dialogues can help developer analysis and define dialogue tasks efficiently. It is challenging to directly apply prior normal text clustering algorithms for task-oriented dialogues, due to the inheren t differences between them, such as coreference, omission and diversity expression. In this paper, we propose a Dialogue Task Clustering Network model for task-oriented clustering. The proposed model combines context-aware utterance representations and cross-dialogue utterance cluster representations for task-oriented dialogues clustering. An iterative end-to-end training strategy is utilized for dialogue clustering and representation learning jointly. Experiments on three public datasets show that our model significantly outperform strong baselines in all metrics.

إعادة التفكير في صفر النار العصبية task-oriented dialogues clustering تجميع الحوارات الموجهة نحو المهام تجمع صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Adaptor Grammars for Unsupervised Paradigm Clustering

محول النحو على تجميع النموذج غير الخاضع

Ask ChatGPT about the research

Read More

suggested questions