BME Submission for SIGMORPHON 2021 Shared Task 0. A Three Step Training Approach with Data Augmentation for Morphological Inflection



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: morphological inflection; BME submission


Abstract

We present the BME submission for the SIGMORPHON 2021 Task 0 Part 1, the Generalization Across Typologically Diverse Languages shared task. We use an LSTM encoder-decoder model with three-step training: the model is first trained on all languages, then fine-tuned on each language family, and finally fine-tuned on individual languages. We use a different type of data augmentation technique in the first two steps. Our system outperformed the only other submission. Although it remains worse than the Transformer baseline released by the organizers, our model is simpler and our data augmentation techniques are easily applicable to new languages. We perform ablation studies and show that the augmentation techniques and the three training steps often help but sometimes have a negative effect. Our code is publicly available.
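The three-step schedule described in the abstract can be illustrated with a minimal Python sketch. It only shows how the training data narrows from all languages to one family to one language. The names `Example`, `three_step_schedule`, `train_three_steps`, `train_fn`, and the toy data are hypothetical and are not taken from the authors' code; the LSTM model itself and the data augmentation techniques are not specified in the abstract and are left out.

```python
from collections import namedtuple

# Hypothetical record type: one inflection example tagged with its language and family.
Example = namedtuple("Example", ["lemma", "tags", "form", "language", "family"])


def three_step_schedule(examples, language):
    """Return the three training subsets, in order, for one target language."""
    families = {ex.family for ex in examples if ex.language == language}
    if len(families) != 1:
        raise ValueError(f"expected exactly one family for {language!r}")
    family = families.pop()
    step1 = list(examples)                                      # step 1: all languages
    step2 = [ex for ex in examples if ex.family == family]      # step 2: the language family
    step3 = [ex for ex in examples if ex.language == language]  # step 3: the language itself
    return [step1, step2, step3]


def train_three_steps(model, examples, language, train_fn):
    """Call train_fn once per step, each time continuing from the previous result."""
    for subset in three_step_schedule(examples, language):
        model = train_fn(model, subset)  # train_fn is an assumed user-supplied trainer
    return model


# Toy data, for illustration only.
TOY_EXAMPLES = [
    Example("Hund", "N;PL", "Hunde", "deu", "germanic"),
    Example("hond", "N;PL", "honden", "nld", "germanic"),
    Example("ev", "N;PL", "evler", "tur", "turkic"),
]
```

In this sketch, `train_fn(model, subset)` stands in for any routine that trains or fine-tunes a model on a subset and returns the updated model, so the same loop covers the initial training and both fine-tuning steps.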

References used

https://aclanthology.org/


Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

Association for Computational Linguistics, 2021 (article)

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms. To this end, we release corpora for 5 development and 9 test languages, as well as gold partial paradigms for evaluation. We receive 14 submissions from 4 teams that follow different strategies, and the best performing system is based on adaptor grammars. Results vary significantly across languages. However, all systems are outperformed by a supervised lemmatizer, implying that there is still room for improvement.

Keywords: unsupervised morphological paradigm; morphological paradigm clustering; SIGMORPHON shared task

Were We There Already? Applying Minimal Generalization to the SIGMORPHON-UniMorph Shared Task on Cognitively Plausible Morphological Inflection

Association for Computational Linguistics, 2021 (article)

Morphological rules with various levels of specificity can be learned from example lexemes by recursive application of minimal generalization (Albright and Hayes, 2002, 2003). A model that learns rules solely through minimal generalization was used to predict average human wug-test ratings from German, English, and Dutch in the SIGMORPHON-UniMorph 2021 Shared Task, with competitive results. Some formal properties of the minimal generalization operation were proved. An automatic method was developed to create wug-test stimuli for future experiments that investigate whether the model's morphological generalizations are too minimal.

Keywords: plausible morphological inflection; cognitively plausible morphological

IST-Unbabel 2021 Submission for the Quality Estimation Shared Task

Association for Computational Linguistics, 2021 (article)

We present the joint contribution of IST and Unbabel to the WMT 2021 Shared Task on Quality Estimation. Our team participated in two tasks: Direct Assessment and Post-Editing Effort, encompassing a total of 35 submissions. For all submissions, our efforts focused on training multilingual models on top of the OpenKiwi predictor-estimator architecture, using pre-trained multilingual encoders combined with adapters. We further experiment with uncertainty-related objectives and features as well as training on out-of-domain direct assessment data.

Keywords: direct assessment

CLUZH at SIGMORPHON 2021 Shared Task on Multilingual Grapheme-to-Phoneme Conversion: Variations on a Baseline

Association for Computational Linguistics, 2021 (article)

This paper describes the submission by the team from the Department of Computational Linguistics, Zurich University, to the Multilingual Grapheme-to-Phoneme Conversion (G2P) Task 1 of the SIGMORPHON 2021 challenge in the low and medium settings. The submission is a variation of our 2020 G2P system, which serves as the baseline for this year's challenge. The system is a neural transducer that operates over explicit edit actions and is trained with imitation learning. For this challenge, we experimented with the following changes: a) emitting phoneme segments instead of single character phonemes, b) input character dropout, c) a mogrifier LSTM decoder (Melis et al., 2019), d) enriching the decoder input with the currently attended input character, e) parallel BiLSTM encoders, and f) an adaptive batch size scheduler. In the low setting, our best ensemble improved over the baseline; in the medium setting, however, the baseline was stronger on average, although improvements could be observed for certain languages.

Keywords: CLUZH at SIGMORPHON; Zurich University

NLPIITR at SemEval-2021 Task 6: RoBERTa Model with Data Augmentation for Persuasion Techniques Detection

Association for Computational Linguistics, 2021 (article)

This paper describes and examines different systems to address Task 6 of SemEval-2021: Detection of Persuasion Techniques In Texts And Images, Subtask 1. The task aims to build a model for identifying rhetorical and psychological techniques (such as causal oversimplification, name-calling, smear) in the textual content of a meme, which is often used in a disinformation campaign to influence the users. The paper provides an extensive comparison among various machine learning systems as a solution to the task. We elaborate on the pre-processing of the text data in favor of the task and present ways to overcome the class imbalance. The results show that fine-tuning a RoBERTa model gave the best results with an F1-Micro score of 0.51 on the development set.

Keywords: persuasion techniques detection; augmentation for persuasion
