This paper describes the system submitted to the IWSLT 2021 Multilingual Speech Translation (MultiST) task from Huawei Noah's Ark Lab. We use a unified transformer architecture for our MultiST model, so that data from different modalities (i.e., speech and text) and different tasks (i.e., Speech Recognition, Machine Translation, and Speech Translation) can be exploited to enhance the model's ability. Specifically, speech and text inputs are first fed to different feature extractors to extract acoustic and textual features, respectively. Then, these features are processed by a shared encoder-decoder architecture. We apply several training techniques to improve performance, including multi-task learning, task-level curriculum learning, and data augmentation. Our final system achieves significantly better results than bilingual baselines on supervised language pairs and yields reasonable results on zero-shot language pairs.
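The routing idea in the abstract can be sketched as follows: modality-specific feature extractors turn speech and text inputs into a common feature-sequence representation, which a single shared encoder-decoder then consumes, allowing ASR, MT, and ST data to train the same shared parameters. This is a minimal illustrative sketch only; all class and function names (`Features`, `acoustic_extractor`, `textual_extractor`, `shared_encoder_decoder`, `translate`) are hypothetical stand-ins, not the paper's implementation, and real systems would operate on tensors rather than Python lists.

```python
# Hypothetical sketch of the unified multi-modal routing described above:
# speech and text inputs pass through modality-specific feature extractors,
# then share one encoder-decoder. Names are illustrative, not from the paper.
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Features:
    """A toy feature sequence (real systems would use tensors)."""
    vectors: List[List[float]]
    modality: str


def acoustic_extractor(waveform: Sequence[float]) -> Features:
    # Stand-in for e.g. convolutional subsampling of filterbank frames:
    # here we simply group raw samples into fixed-size "frames".
    frame = 4
    frames = [list(waveform[i:i + frame]) for i in range(0, len(waveform), frame)]
    return Features(vectors=frames, modality="speech")


def textual_extractor(tokens: Sequence[str]) -> Features:
    # Stand-in for a token-embedding lookup.
    return Features(vectors=[[float(len(t))] for t in tokens], modality="text")


def shared_encoder_decoder(feats: Features, target_lang: str) -> str:
    # Stand-in for the shared transformer encoder-decoder. It only sees
    # modality-agnostic feature sequences, which is what lets ASR, MT,
    # and ST examples all update the same shared parameters.
    return f"<{target_lang}> decoded from {len(feats.vectors)} {feats.modality} features"


def translate(inp: Union[Sequence[float], Sequence[str]], target_lang: str) -> str:
    # Route by modality, then decode with the shared parameters.
    if inp and isinstance(inp[0], str):
        feats = textual_extractor(inp)  # MT / text input
    else:
        feats = acoustic_extractor(inp)  # ST / speech input
    return shared_encoder_decoder(feats, target_lang)
```

For example, `translate([0.1] * 8, "de")` routes through the acoustic extractor while `translate(["hello", "world"], "de")` routes through the textual one, yet both reach the same shared decoder, mirroring how the unified architecture shares capacity across tasks.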