
Simple induction of (deterministic) probabilistic finite-state automata for phonotactics by stochastic gradient descent


Publication date: 2021
Language: English





We introduce a simple and highly general phonotactic learner which induces a probabilistic finite-state automaton from word-form data. We describe the learner and show how to parameterize it to induce unrestricted regular languages, as well as how to restrict it to certain subregular classes such as Strictly k-Local and Strictly k-Piecewise languages. We evaluate the learner on its ability to learn phonotactic constraints in toy examples and in datasets of Quechua and Navajo. We find that an unrestricted learner is the most accurate overall when modeling attested forms not seen in training; however, only the learner restricted to the Strictly Piecewise language class successfully captures certain nonlocal phonotactic constraints. Our learner serves as a baseline for more sophisticated methods.
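
As a concrete illustration of the method described above, here is a minimal sketch (ours, not the authors' released code) of inducing a PFSA by stochastic gradient descent, restricted to the Strictly 2-Local case, where the automaton's state is simply the previous symbol. The alphabet, boundary encoding, toy data, and learning rate are all illustrative assumptions.

    import numpy as np

    ALPHABET = ["#", "a", "b"]          # "#" = word boundary (assumed encoding)
    IDX = {s: i for i, s in enumerate(ALPHABET)}
    V = len(ALPHABET)
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(V, V))   # transition logits W[prev, next]

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def transitions(word):
        syms = ["#"] + list(word) + ["#"]
        return zip(syms, syms[1:])

    def nll(word):
        # Negative log-likelihood of one word under the current PFSA.
        return -sum(np.log(softmax(W[IDX[p]])[IDX[n]]) for p, n in transitions(word))

    def sgd_step(word, lr=0.1):
        # For a softmax-parameterized row, d(NLL)/d(logits) = probs - onehot(next).
        for p, n in transitions(word):
            grad = softmax(W[IDX[p]])
            grad[IDX[n]] -= 1.0
            W[IDX[p]] -= lr * grad

    data = ["ab", "aab", "abb"]         # toy word-form data (assumed)
    for _ in range(300):
        for w in data:
            sgd_step(w)

    print(nll("ab") < nll("ba"))        # the attested pattern scores better: True

A Strictly k-Piecewise variant would condition on observed subsequences rather than on the immediately preceding window, which is what lets it capture nonlocal constraints; the unrestricted learner would instead learn transitions over a fixed set of abstract states.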



Related research

Shupamem, a language of Western Cameroon, is a tonal language which also exhibits the morpho-phonological process of full reduplication. This creates two challenges for finite-state models of its morphosyntax and morphophonology: how to handle full reduplication and the autosegmental nature of lexical tone. Dolatian and Heinz (2020) explain how 2-way finite-state transducers can model full reduplication without an exponential increase in states, and finite-state transducers with multiple tapes have been used to model autosegmental tiers, including tone (Wiebe, 1992; Dolatian and Rawski, 2020a). Here we synthesize 2-way finite-state transducers and multitape transducers, resulting in a finite-state formalism that subsumes both, to account for the full reduplicative processes in Shupamem, which also affect tone.
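
To make the state-economy argument concrete, here is a toy simulation (ours, not Dolatian and Heinz's formalism, and ignoring tone) of a 2-way transducer performing full reduplication with only three working states, whatever the input length: one forward pass copies the input, a rewind pass returns the read head to the start, and a second forward pass copies it again.

    def two_way_reduplicate(inp):
        tape = ["<"] + list(inp) + [">"]    # left/right end markers
        state, pos, out = "pass1", 1, []
        while state != "halt":
            sym = tape[pos]
            if state == "pass1":            # copy the input left to right
                if sym == ">":
                    state, pos = "rewind", pos - 1
                else:
                    out.append(sym)
                    pos += 1
            elif state == "rewind":         # move the head back to the start
                if sym == "<":
                    state, pos = "pass2", pos + 1
                else:
                    pos -= 1
            elif state == "pass2":          # copy the input a second time
                if sym == ">":
                    state = "halt"
                else:
                    out.append(sym)
                    pos += 1
        return "".join(out)

    print(two_way_reduplicate("pam"))       # -> "pampam"

The three working states are fixed regardless of input length, which is the point: a one-way transducer would instead have to memorize the entire (bounded) reduplicant in its state, causing the exponential blow-up the paper avoids.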
Total reduplication is common in natural language phonology and morphology. However, formalized as copying of reduplicants of unbounded size, unrestricted total reduplication requires computational power beyond context-free, while other phonological and morphological patterns are regular, or even sub-regular. Thus, existing language classes characterizing reduplicated strings inevitably include typologically unattested context-free patterns, such as reversals. This paper extends regular languages to incorporate reduplication by introducing a new computational device: finite-state buffered machines (FSBMs). We give their mathematical definition and discuss some closure properties of the corresponding set of languages. As a result, the class of regular languages, together with the languages derived from them through a copying mechanism, is characterized. As suggested by previous literature, this class of languages should approach a characterization of natural language word sets.
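
The copying mechanism can be pictured with a deliberately simplified recognizer (our illustration, not the paper's formal FSBM definition): in buffering mode the machine stores a prefix that its base finite-state language accepts, then in emptying mode it must replay the buffer symbol by symbol.

    def fsbm_accepts(s, base_accepts=lambda w: True):
        # Nondeterministically guess where buffering ends (simulated by
        # trying every split), require the buffered span to lie in the
        # base regular language, then match the remainder against the
        # buffer one symbol at a time.
        for mid in range(len(s) + 1):
            w, rest = s[:mid], s[mid:]
            if base_accepts(w) and rest == w:
                return True
        return False

    print(fsbm_accepts("baba"), fsbm_accepts("abba"))   # True False

Note that the reversal "abba" is rejected: replaying a buffer yields copies, not mirror images, which is how such a device can exclude the typologically unattested reversals mentioned above.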
In this work, we propose a novel framework, Gradient Aligned Mutual Learning BERT (GAML-BERT), for improving the early exiting of BERT. GAML-BERT's contributions are two-fold. First, we conduct a set of pilot experiments which show that mutual knowledge distillation between a shallow exit and a deep exit leads to better performance for both. Based on this observation, we use mutual learning to improve BERT's early-exiting performance: each exit of a multi-exit BERT distills knowledge from the others. Second, we propose GA, a novel training method that aligns the gradients from knowledge distillation with the cross-entropy losses. Extensive experiments conducted on the GLUE benchmark show that GAML-BERT significantly outperforms state-of-the-art (SOTA) BERT early-exiting methods.
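
The abstract does not spell out the alignment rule, so the following is only one plausible, PCGrad-style reading (an assumption on our part, not the paper's exact GA procedure): when the knowledge-distillation gradient conflicts with the cross-entropy gradient (negative dot product), its conflicting component is projected away before the update.

    import numpy as np

    def aligned_update(g_kd, g_ce):
        # Hypothetical alignment rule (assumption): remove from the
        # distillation gradient any component pointing against the
        # cross-entropy gradient, then combine the two.
        dot = float(g_kd @ g_ce)
        if dot < 0.0:
            g_kd = g_kd - (dot / float(g_ce @ g_ce)) * g_ce
        return g_kd + g_ce

    g_kd = np.array([1.0, -2.0])
    g_ce = np.array([1.0, 1.0])
    print(aligned_update(g_kd, g_ce))   # conflicting part of g_kd removed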
Neural abstractive summarization systems have made significant progress in recent years. However, abstractive summarization systems often produce inconsistent statements or false facts. How can we automatically generate highly abstractive yet factually correct summaries? In this paper, we propose an efficient weakly supervised adversarial data augmentation approach to construct a factual consistency dataset. Based on this artificial dataset, we train an evaluation model that not only makes accurate and robust factual-consistency judgments but can also trace interpretable factual errors via the backpropagated gradient distribution over token embeddings. Experiments and analysis conducted on public annotated summarization and factual consistency datasets demonstrate that our approach is effective and reasonable.
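
As one common recipe for this kind of weakly supervised augmentation (our assumption; the paper's procedure may differ), negative examples can be manufactured by corrupting a faithful summary, e.g. swapping an entity for a different one from the source document:

    import random

    def make_training_pair(summary, doc_entities, rng=random.Random(0)):
        # Build (consistent, inconsistent) summary pairs by entity swapping;
        # doc_entities is a list of entity strings from the source document.
        tokens = summary.split()
        slots = [i for i, t in enumerate(tokens) if t in doc_entities]
        if not slots:
            return None
        i = rng.choice(slots)
        alternatives = [e for e in doc_entities if e != tokens[i]]
        if not alternatives:
            return None
        tokens[i] = rng.choice(alternatives)
        return {"consistent": summary, "inconsistent": " ".join(tokens)}

    print(make_training_pair("Apple acquired Shazam", ["Apple", "Google", "Shazam"]))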
Traditionally, character-level transduction problems have been solved with finite-state models designed to encode structural and linguistic knowledge of the underlying process, whereas recent approaches rely on the power and flexibility of sequence-to-sequence models with attention. Focusing on the less explored unsupervised learning scenario, we compare the two model classes side by side and find that they tend to make different types of errors even when achieving comparable performance. We analyze the distributions of different error classes using two unsupervised tasks as testbeds: converting informally romanized text into the native script of its language (for Russian, Arabic, and Kannada) and translating between a pair of closely related languages (Serbian and Bosnian). Finally, we investigate how combining finite-state and sequence-to-sequence models at decoding time affects the output quantitatively and qualitatively.
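
One simple way to combine the two model classes at decoding time (a sketch under our own assumptions; fst_logprob and seq2seq_logprob are hypothetical scoring callables, and the paper may combine the models differently) is to interpolate their next-character log-probabilities inside a greedy decoder:

    def combined_greedy_decode(src, fst_logprob, seq2seq_logprob,
                               vocab, lam=0.5, max_len=50):
        # Score each candidate character under both models and take the
        # interpolated argmax; "</s>" ends the transduction.
        out = []
        for _ in range(max_len):
            scores = {c: lam * fst_logprob(src, out, c)
                         + (1.0 - lam) * seq2seq_logprob(src, out, c)
                      for c in list(vocab) + ["</s>"]}
            best = max(scores, key=scores.get)
            if best == "</s>":
                break
            out.append(best)
        return "".join(out)

    # Stand-in scorers that both favor copying the source (demo only).
    def copy_score(src, prefix, c):
        target = src[len(prefix)] if len(prefix) < len(src) else "</s>"
        return 0.0 if c == target else -5.0

    print(combined_greedy_decode("abc", copy_score, copy_score, vocab="abc"))  # -> "abc"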
