
On the Difficulty of Segmenting Words with Attention


Publication date: 2021
Language: English





Word segmentation, the problem of finding word boundaries in speech, is of interest for a range of tasks. Previous papers have suggested that for sequence-to-sequence models trained on tasks such as speech translation or speech recognition, attention can be used to locate and segment the words. We show, however, that even on monolingual data this approach is brittle. In our experiments with different input types, data sizes, and segmentation algorithms, only models trained to predict phones from words succeed in the task. Models trained to predict words from either phones or speech (i.e., the opposite direction needed to generalize to new data), yield much worse results, suggesting that attention-based segmentation is only useful in limited scenarios.
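As a rough illustration of the segmentation procedure being evaluated, the sketch below derives word boundaries from a cross-attention matrix by assigning each source phone to the target word it attends to most. The array shapes, the argmax assignment rule, and the toy matrix are illustrative assumptions, not the exact algorithms compared in the paper.

import numpy as np

def segment_from_attention(attn):
    """Place word boundaries where the attention-based word assignment changes.

    attn: weights of shape (num_target_words, num_source_phones) taken from a
    trained sequence-to-sequence model (here, a model mapping phones to words).
    Returns source indices at which a new word is assumed to start.
    """
    word_of_phone = attn.argmax(axis=0)               # most-attended word per phone
    boundaries = [0]
    for i in range(1, len(word_of_phone)):
        if word_of_phone[i] != word_of_phone[i - 1]:  # assignment changed -> boundary
            boundaries.append(i)
    return boundaries

# Toy example: 3 target words attending over 7 source phones.
attn = np.array([
    [0.8, 0.7, 0.1, 0.0, 0.0, 0.1, 0.0],
    [0.1, 0.2, 0.8, 0.9, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
])
print(segment_from_attention(attn))                   # -> [0, 2, 4]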

Related research

Deep Learning-based NLP systems can be sensitive to unseen tokens and hard to learn with high-dimensional inputs, which critically hinders generalization. We introduce an approach that groups input words based on their semantic diversity to simplify the input language representation with low ambiguity. Since semantically diverse words reside in different contexts, we are able to substitute words with their groups and still distinguish word meanings from those contexts. We design several algorithms that compute diverse groupings based on random sampling, geometric distances, and entropy maximization, and we prove formal guarantees for the entropy-based algorithms. Experimental results show that our methods generalize NLP models, yield enhanced accuracy on POS tagging and LM tasks, and give significant improvements on medium-scale machine translation tasks, up to +6.5 BLEU points. Our source code is available at https://github.com/abdulrafae/dg.
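A minimal sketch of the word-grouping idea above, using only the geometric-distance variant; the toy vocabulary, the random stand-in embeddings, and the crude k-means routine are assumptions for illustration, not the released dg code.

import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "car", "bus", "run", "walk"]
vectors = rng.normal(size=(len(vocab), 8))            # stand-in word embeddings

def group_by_distance(vecs, k=3, iters=20):
    """Crude k-means grouping of word vectors (the geometric-distance strategy)."""
    centers = vecs[:k].copy()
    labels = np.zeros(len(vecs), dtype=int)
    for _ in range(iters):
        dists = ((vecs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)                  # nearest center per word
        for j in range(k):
            if (labels == j).any():
                centers[j] = vecs[labels == j].mean(axis=0)
    return labels

group_of = dict(zip(vocab, group_by_distance(vectors)))
sentence = ["the", "dog", "can", "run", "fast"]
# Replace known words by their group token; context is left to disambiguate them.
print([f"G{group_of[w]}" if w in group_of else w for w in sentence])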
This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting. Focusing on the encoder-decoder attention mechanism, we prove that attention weights systematically make alignment errors by relying mainly on uninformative tokens from the source sequence. However, we observe that NMT models assign attention to these tokens to regulate the contribution of the two contexts, the source and the prefix of the target sequence, to the prediction. We provide evidence about the influence of wrong alignments on the model behavior, demonstrating that the encoder-decoder attention mechanism is well suited as an interpretability method for NMT. Finally, based on our analysis, we propose methods that largely reduce the word alignment error rate compared to standard alignments induced from attention weights.
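The baseline alignment induction that such an analysis starts from can be sketched as follows; the layer and head averaging is omitted, and the argmax link rule plus the toy near-diagonal matrix are illustrative assumptions, not the corrected methods the paper proposes.

import numpy as np

def induce_alignment(cross_attn, src_tokens, tgt_tokens):
    """Link each target token to the source token with the highest
    encoder-decoder attention weight (rows = target, columns = source)."""
    return [(src_tokens[int(row.argmax())], tgt)
            for row, tgt in zip(cross_attn, tgt_tokens)]

src = ["das", "Haus", "ist", "klein", "."]
tgt = ["the", "house", "is", "small", "."]
attn = np.full((5, 5), 0.06) + np.eye(5) * 0.7        # toy near-diagonal attention
print(induce_alignment(attn, src, tgt))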
Automatic summarization aims to extract important information from large amounts of textual data in order to create a shorter version of the original texts while preserving their information. Training traditional extractive summarization models relies heavily on human-engineered labels such as sentence-level annotations of summary-worthiness. However, in many use cases such labels do not exist, and manually annotating thousands of documents to train models may not be feasible. On the other hand, indirect signals for summarization are often available, such as agent actions for customer service dialogues, headlines for news articles, and diagnoses for Electronic Health Records. In this paper, we develop a general framework that produces extractive summaries as a byproduct of supervised learning on such indirect signals, with the help of an attention mechanism. We test our models on customer service dialogues, and experimental results demonstrate that our models can reliably select informative sentences and words for automatic summarization.
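A compact sketch of the selection step, assuming sentence-level attention scores are already available from a model trained on an indirect signal; the toy dialogue, the attention matrix, and the top-k rule are illustrative assumptions rather than the paper's full framework.

import numpy as np

def summarize_by_attention(sentences, attn, k=2):
    """Return the k sentences receiving the most attention mass, in document
    order. attn rows are model queries, columns are sentences."""
    scores = attn.sum(axis=0)                          # total attention per sentence
    keep = sorted(np.argsort(scores)[-k:])
    return [sentences[i] for i in keep]

dialogue = [
    "Hi, how can I help you today?",
    "My router keeps dropping the connection every hour.",
    "Have you tried restarting it?",
    "Yes, and the status light still blinks red afterwards.",
]
attn = np.array([[0.05, 0.60, 0.05, 0.30],             # toy attention from a model
                 [0.10, 0.30, 0.10, 0.50]])            # trained on an indirect signal
print(summarize_by_attention(dialogue, attn))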
Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies. Free-order case-marking languages, such as Russian, Latin, or Tamil, have proved more challenging than fixed-order languages for the tasks of syntactic parsing and subject-verb agreement prediction. In this work, we investigate whether this class of languages is also more difficult to translate by state-of-the-art Neural Machine Translation (NMT) models. Using a variety of synthetic languages and a newly introduced translation challenge set, we find that word order flexibility in the source language only leads to a very small loss of NMT quality, even though the core verb arguments become impossible to disambiguate in sentences without semantic cues. The latter issue is indeed solved by the addition of case marking. However, in medium- and low-resource settings, the overall NMT quality of fixed-order languages remains unmatched.
Character-based word-segmentation models have been extensively applied to unsegmented languages such as Thai because of their high performance. These models estimate word boundaries from a character sequence. However, a character unit in a sequence carries no inherent meaning, in contrast to word, subword, and character-cluster units. We propose a Thai word-segmentation model that exploits various types of information, including words, subwords, and character clusters, derived from a character sequence. Our model applies multiple attention mechanisms to refine segmentation inferences by estimating the significant relationships among characters and the various unit types. The experimental results indicate that our model outperforms other state-of-the-art Thai word-segmentation models.
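The character-level framing can be illustrated with a simple boundary-tagging view; the functions below only show how per-character word-start labels relate to a segmentation, and deliberately omit the multi-attention fusion over words, subwords, and character clusters that the model itself performs.

def boundary_labels(words):
    """Flatten a segmented text into characters plus 1/0 word-start labels."""
    chars, labels = [], []
    for word in words:
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append(1 if i == 0 else 0)
    return chars, labels

def segment(chars, labels):
    """Rebuild words from per-character boundary predictions."""
    words, current = [], ""
    for ch, is_start in zip(chars, labels):
        if is_start and current:
            words.append(current)
            current = ""
        current += ch
    if current:
        words.append(current)
    return words

chars, labels = boundary_labels(["สวัสดี", "ครับ"])     # Thai "hello" + polite particle
print(segment(chars, labels))                          # -> ['สวัสดี', 'ครับ']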
