Speech translation (ST) has lately received growing interest for generating subtitles without the need for an intermediate source-language transcription and timing (i.e. captions). However, the joint generation of source captions and target subtitles not only brings potential output quality advantages when the two decoding processes inform each other, but is also often required in multilingual scenarios. In this work, we focus on ST models that generate captions and subtitles that are consistent in terms of structure and lexical content. We further introduce new metrics for evaluating subtitling consistency. Our findings show that joint decoding leads to increased performance and consistency between the generated captions and subtitles, while still allowing sufficient flexibility to produce subtitles conforming to language-specific needs and norms.
References used
https://aclanthology.org/