Smoothing and Shrinking the Sparse Seq2Seq Search Space

Added by: Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Language: English

Created by: Shamra Editor

Keywords: search space, shrinking the sparse, smoothing and shrinking

Abstract

Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences. While this setup has led to strong results in a variety of tasks, one unsatisfying aspect is its length bias: models give high scores to short, inadequate hypotheses and often make the empty string the argmax (the so-called "cat got your tongue" problem). Recently proposed entmax-based sparse sequence-to-sequence models present a possible solution, since they can shrink the search space by assigning zero probability to bad hypotheses, but their ability to handle word-level tasks with transformers has never been tested. In this work, we show that entmax-based models effectively solve the cat got your tongue problem, removing a major source of model error for neural machine translation. In addition, we generalize label smoothing, a critical regularization technique, to the broader family of Fenchel-Young losses, which includes both cross-entropy and the entmax losses. Our resulting label-smoothed entmax loss models set a new state of the art on multilingual grapheme-to-phoneme conversion and deliver improvements and better calibration properties on cross-lingual morphological inflection and machine translation for 7 language pairs.
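To make the "zero probability for bad hypotheses" idea concrete, the sketch below compares softmax with sparsemax, the α = 2 member of the entmax family. It is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the abstract does not say which α the paper uses, and the function names and example scores are hypothetical.

```python
import numpy as np

def softmax(z):
    """Dense normalization: every entry gets strictly positive probability."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of the scores onto the probability simplex.

    This is entmax with alpha = 2; low-scoring entries receive exactly zero.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]               # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    in_support = 1 + k * z_sorted > cumsum    # which sorted entries stay nonzero
    k_max = k[in_support][-1]                 # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max     # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

# Hypothetical scores for three candidate tokens at one decoding step.
scores = [1.0, 0.8, -2.0]
dense = softmax(scores)     # all three entries are positive
sparse = sparsemax(scores)  # the third entry is exactly 0.0
```

With softmax, even the worst candidate keeps a small positive probability, so every hypothesis (including the empty string) stays in the search space. With sparsemax, the worst candidate here is removed outright, which is the pruning effect the abstract refers to.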

References used

https://aclanthology.org/

SPECTRA: Sparse Structured Text Rationalization

Association for Computational Linguistics, 2021, article

Selective rationalization aims to produce decisions along with rationales (e.g., text highlights or word alignments between two sentences). Commonly, rationales are modeled as stochastic binary masks, requiring sampling-based gradient estimators, which complicates training and requires careful hyperparameter tuning. Sparse attention mechanisms are a deterministic alternative, but they lack a way to regularize the rationale extraction (e.g., to control the sparsity of a text highlight or the number of alignments). In this paper, we present a unified framework for deterministic extraction of structured explanations via constrained inference on a factor graph, forming a differentiable layer. Our approach greatly eases training and rationale regularization, generally outperforming previous work in terms of performance and plausibility of the extracted rationales. We further provide a comparative study of stochastic and deterministic methods for rationale extraction for classification and natural language inference tasks, jointly assessing their predictive power, quality of the explanations, and model variability.

Keywords: structured text rationalization, sparse structured text, selective rationalization aims

Sequential Randomized Smoothing for Adversarially Robust Speech Recognition

Association for Computational Linguistics, 2021, article

While Automatic Speech Recognition has been shown to be vulnerable to adversarial attacks, defenses against these attacks are still lagging. Existing, naive defenses can be partially broken with an adaptive attack. In classification tasks, the Randomized Smoothing paradigm has been shown to be effective at defending models. However, it is difficult to apply this paradigm to ASR tasks, due to their complexity and the sequential nature of their outputs. Our paper overcomes some of these challenges by leveraging speech-specific tools like enhancement and ROVER voting to design an ASR model that is robust to perturbations. We apply adaptive versions of state-of-the-art attacks, such as the Imperceptible ASR attack, to our model, and show that our strongest defense is robust to all attacks that use inaudible noise, and can only be broken with very high distortion.

Keywords: adversarially robust speech, robust speech recognition

BioCopy: A Plug-And-Play Span Copy Mechanism in Seq2Seq Models

Association for Computational Linguistics, 2021, article

Copy mechanisms explicitly obtain unchanged tokens from the source (input) sequence to generate the target (output) sequence under the neural seq2seq framework. However, most existing copy mechanisms only consider copying single words from the source sentences, which results in losing essential tokens when copying long spans. In this work, we propose a plug-and-play architecture, BioCopy, to alleviate this problem. Specifically, in the training stage, we construct a BIO tag for each token and train the original model jointly with the BIO tags. In the inference stage, the model first predicts the BIO tag at each time step, then applies different masking strategies based on the predicted BIO label to narrow the probability distribution over the vocabulary. Experimental results on two separate generative tasks show that adding BioCopy to the original model structure outperforms the baseline models in both cases.

Keywords: span copy mechanism, copy mechanisms explicitly, copy mechanisms

Profanity-Avoiding Training Framework for Seq2seq Models with Certified Robustness

Association for Computational Linguistics, 2021, article

Seq2seq models have demonstrated their incredible effectiveness in a large variety of applications. However, recent research has shown that inappropriate language in training samples and well-designed testing cases can induce seq2seq models to output profanity. These outputs may potentially hurt the usability of seq2seq models and make the end-users feel offended. To address this problem, we propose a training framework with certified robustness to eliminate the causes that trigger the generation of profanity. The proposed training framework leverages merely a short list of profanity examples to prevent seq2seq models from generating a broader spectrum of profanity. The framework is composed of a pattern-eliminating training component to suppress the impact of language patterns with profanity in the training set, and a trigger-resisting training component to provide certified robustness for seq2seq models against intentionally injected profanity-triggering expressions in test samples. In the experiments, we consider two representative NLP tasks that seq2seq can be applied to, i.e., style transfer and dialogue generation. Extensive experimental results show that the proposed training framework can successfully prevent the NLP models from generating profanity.

Keywords: profanity-avoiding training framework, training framework

Hybrid Tabu Search And Guided Local Search And Existence 2-Opt Local Search To Contribute In Solving The Vehicle Routing Problem With Time Windows

Tishreen University, 2017, research paper

In this research, we study a possible contribution to solving the Vehicle Routing Problem with Time Windows (VRPTW), an NP-hard optimization problem. We present a hybrid algorithm (HA) that integrates the Tabu Search algorithm and the Guided Local Search algorithm with 2-Opt local search, building on the savings algorithm, which pursues a particular objective in order to yield large savings. We compare the presented approach against standard tests to demonstrate its efficiency and its impact on solution quality, in terms of speed of convergence and the ability to find better solutions.

Keywords: Vehicle Routing Problem With Time Windows, Tabu Search Algorithm, Guided Local Search, Savings Algorithm
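The 2-Opt local search named in this entry can be sketched in a few lines. The code below is a plain 2-opt on a single closed tour and is only an illustration of that one component: it ignores time windows and does not include the Tabu Search, Guided Local Search, or savings steps of the hybrid algorithm. The function names and coordinates are hypothetical.

```python
import math

def tour_length(tour, coords):
    """Total length of a closed tour that visits the points in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def two_opt(tour, coords):
    """Repeatedly reverse a segment of the tour while doing so shortens it."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reversing best[i..j] replaces two edges of the tour with two others.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, coords) < tour_length(best, coords) - 1e-12:
                    best = candidate
                    improved = True
    return best

# Four corners of a unit square, visited in an order whose edges cross.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
crossing_tour = [0, 2, 1, 3]
untangled = two_opt(crossing_tour, coords)
```

Recomputing the full tour length for every candidate keeps the sketch short; a practical implementation would evaluate only the change in the two swapped edges and would also check time-window feasibility before accepting a move.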
