Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

MT5: محول نص إلى نص إلى نص متعدد اللغات

817 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

massively multilingual pre-trained english-language nlp tasks transfer transformer متعدد اللغات بشكل كبير مدرب مسبقا مهام NLP اللغة الإنجليزية نقل المحولات صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The recent Text-to-Text Transfer Transformer'' (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent accidental translation'' in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.

References used

https://aclanthology.org/

rate research

mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs

739 - Association for Computation Linguistics 2021 مقالة

Multilingual T5 pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks. In this paper, we improve multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on seven multilingual benchmark datasets, including sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.

multilingual pretrained احتجاز متعدد اللغات صناعة حمض الفوسفور

Controllable Sentence Simplification with a Unified Text-to-Text Transfer Transformer

860 - Association for Computation Linguistics 2021 مقالة

Recently, a large pre-trained language model called T5 (A Unified Text-to-Text Transfer Transformer) has achieved state-of-the-art performance in many NLP tasks. However, no study has been found using this pre-trained model on Text Simplification. Th erefore in this paper, we explore the use of T5 fine-tuning on Text Simplification combining with a controllable mechanism to regulate the system outputs that can help generate adapted text for different target audiences. Our experiments show that our model achieves remarkable results with gains of between +0.69 and +1.41 over the current state-of-the-art (BART+ACCESS). We argue that using a pre-trained model such as T5, trained on several tasks with large amounts of data, can help improve Text Simplification.

فك التشفير العاطفي controllable sentence simplification sentence simplification تبسيط الجملة القابلة للتحكم تبسيط الجملة صناعة حمض الفوسفور

CoTexT: Multi-task Learning with Code-Text Transformer

658 - Association for Computation Linguistics 2021 مقالة

We present CoTexT, a pre-trained, transformer-based encoder-decoder model that learns the representative context between natural language (NL) and programming language (PL). Using self-supervision, CoTexT is pre-trained on large programming language corpora to learn a general understanding of language and code. CoTexT supports downstream NL-PL tasks such as code summarizing/documentation, code generation, defect detection, and code debugging. We train CoTexT on different combinations of available PL corpus including both bimodal'' and unimodal'' data. Here, bimodal data is the combination of text and corresponding code snippets, whereas unimodal data is merely code snippets. We first evaluate CoTexT with multi-task learning: we perform Code Summarization on 6 different programming languages and Code Refinement on both small and medium size featured in the CodeXGLUE dataset. We further conduct extensive experiments to investigate CoTexT on other tasks within the CodeXGlue dataset, including Code Generation and Defect Detection. We consistently achieve SOTA results in these tasks, demonstrating the versatility of our models.

code-text transformer code محول النص رمز الشفرة صناعة حمض الفوسفور

RE-AIMing Predictive Text

846 - Association for Computation Linguistics 2021 مقالة

Our increasing reliance on mobile applications means much of our communication is mediated with the support of predictive text systems. How do these systems impact interpersonal communication and broader society? In what ways are predictive text syst ems harmful, to whom, and why? In this paper, we focus on predictive text systems on mobile devices and attempt to answer these questions. We introduce the concept of a text entry intervention' as a way to evaluate predictive text systems through an interventional lens, and consider the Reach, Effectiveness, Adoption, Implementation, and Maintenance (RE-AIM) of predictive text systems. We finish with a discussion of opportunities for NLP.

predictive text systems predictive text re-aiming predictive text أنظمة النص التنبؤية نص تنبؤي إعادة تهدف إلى نص التنبؤ صناعة حمض الفوسفور المزيد..

DiscoDVT: Generating Long Text with Discourse-Aware Discrete Variational Transformer

656 - Association for Computation Linguistics 2021 مقالة

Despite the recent advances in applying pre-trained language models to generate high-quality texts, generating long passages that maintain long-range coherence is yet challenging for these models. In this paper, we propose DiscoDVT, a discourse-aware discrete variational Transformer to tackle the incoherence issue. DiscoDVT learns a discrete variable sequence that summarizes the global structure of the text and then applies it to guide the generation process at each decoding step. To further embed discourse-aware information into the discrete latent representations, we introduce an auxiliary objective to model the discourse relations within the text. We conduct extensive experiments on two open story generation datasets and demonstrate that the latent codes learn meaningful correspondence to the discourse structures that guide the model to generate long texts with better long-range coherence.

discrete variational transformer generating long text discourse-aware discrete variational محول متغيرات منفصلة توليد نص طويل خطاب علم متغيرات منفصلة صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

MT5: محول نص إلى نص إلى نص متعدد اللغات

Ask ChatGPT about the research

Read More

suggested questions