بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework

69 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mingbo Ma

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Mingbo Ma - Liang Huang - Hao Xiong

الحساب واللغة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we propose a novel prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very simple yet surprisingly effective wait-k policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on 4 directions: zh<->en and de<->en.

قيم البحث

292 - Mingbo Ma , Baigong Zheng , Kaibo Liu 2019

Text-to-speech synthesis (TTS) has witnessed rapid progress in recent years, where neural methods became capable of producing audios with high naturalness. However, these efforts still suffer from two types of latencies: (a) the {em computational lat ency} (synthesizing time), which grows linearly with the sentence length even with parallel approaches, and (b) the {em input latency} in scenarios where the input text is incrementally generated (such as in simultaneous translation, dialog generation, and assistive technologies). To reduce these latencies, we devise the first neural incremental TTS approach based on the recently proposed prefix-to-prefix framework. We synthesize speech in an online fashion, playing a segment of audio while generating the next, resulting in an $O(1)$ rather than $O(n)$ latency.

الحساب واللغة أنظمة الصوت في الحاسوب معالجة الصوت والكلام

Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training

69 - Renjie Zheng , Mingbo Ma , Baigong Zheng 2020

Simultaneous speech-to-speech translation is widely useful but extremely challenging, since it needs to generate target-language speech concurrently with the source-language speech, with only a few seconds delay. In addition, it needs to continuously translate a stream of sentences, but all recent solutions merely focus on the single-sentence scenario. As a result, current approaches accumulate latencies progressively when the speaker talks faster, and introduce unnatural pauses when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation (SAT) which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech (as measured by the naturalness metric MOS) with substantially lower latency than the baseline, in both Zh <-> En directions.

الحساب واللغة الذكاء الاصطناعي

Reserved-Length Prefix Coding

376 - Michael B. Baer 2007

Huffman coding finds an optimal prefix code for a given probability mass function. Consider situations in which one wishes to find an optimal code with the restriction that all codewords have lengths that lie in a user-specified set of lengths (or, e quivalently, no codewords have lengths that lie in a complementary set). This paper introduces a polynomial-time dynamic programming algorithm that finds optimal codes for this reserved-length prefix coding problem. This has applications to quickly encoding and decoding lossless codes. In addition, one modification of the approach solves any quasiarithmetic prefix coding problem, while another finds optimal codes restricted to the set of codes with g codeword lengths for user-specified g (e.g., g=2).

نظرية المعلومات بنى وهياكل البيانات والخوارزميات نظرية المعلومات

Weakened Random Oracle Models with Target Prefix

63 - Masayuki Tezuka , Yusuke Yoshida , Keisuke Tanaka 2021

Weakened random oracle models (WROMs) are variants of the random oracle model (ROM). The WROMs have the random oracle and the additional oracle which breaks some property of a hash function. Analyzing the security of cryptographic schemes in WROMs, w e can specify the property of a hash function on which the security of cryptographic schemes depends. Liskov (SAC 2006) proposed WROMs and later Numayama et al. (PKC 2008) formalized them as CT-ROM, SPT-ROM, and FPT-ROM. In each model, there is the additional oracle to break collision resistance, second preimage resistance, preimage resistance respectively. Tan and Wong (ACISP 2012) proposed the generalized FPT-ROM (GFPT-ROM) which intended to capture the chosen prefix collision attack suggested by Stevens et al. (EUROCRYPT 2007). In this paper, in order to analyze the security of cryptographic schemes more precisely, we formalize GFPT-ROM and propose additional three WROMs which capture the chosen prefix collision attack and its variants. In particular, we focus on signature schemes such as RSA-FDH, its variants, and DSA, in order to understand essential roles of WROMs in their security proofs.

التشفير والأمن

SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation

127 - Xutai Ma , Juan Pino , Philipp Koehn 2020

Simultaneous text translation and end-to-end speech translation have recently made great progress but little work has combined these tasks together. We investigate how to adapt simultaneous text translation methods such as wait-k and monotonic multih ead attention to end-to-end simultaneous speech translation by introducing a pre-decision module. A detailed analysis is provided on the latency-quality trade-offs of combining fixed and flexible pre-decision with fixed and flexible policies. We also design a novel computation-aware latency metric, adapted from Average Lagging.

الحساب واللغة

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة الافتراضية السورية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً