UER: An Open-Source Toolkit for Pre-training Models

216 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Zhe Zhao

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Zhe Zhao - Hui Chen - Jinbin Zhang

الحساب واللغة التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Existing works, including ELMO and BERT, have revealed the importance of pre-training for NLP tasks. While there does not exist a single pre-training model that works best in all cases, it is of necessity to develop a framework that is able to deploy various pre-training models efficiently. For this purpose, we propose an assemble-on-demand pre-training toolkit, namely Universal Encoder Representations (UER). UER is loosely coupled, and encapsulated with rich modules. By assembling modules on demand, users can either reproduce a state-of-the-art pre-training model or develop a pre-training model that remains unexplored. With UER, we have built a model zoo, which contains pre-trained models based on different corpora, encoders, and targets (objectives). With proper pre-trained models, we could achieve new state-of-the-art results on a range of downstream datasets.

قيم البحث

183 - Fei Huang , Dazhen Wan , Zhihong Shao 2020

In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions. We present CoTK, an open-source toolkit aiming to support fast development and fair evaluation of text generation. In model development, CoTK helps handle the cumbersome issues, such as data processing, metric implementation, and reproduction. It standardizes the development steps and reduces human errors which may lead to inconsistent experimental settings. In model evaluation, CoTK provides implementation for many commonly used metrics and benchmark models across different experimental settings. As a unique feature, CoTK can signify when and which metric cannot be fairly compared. We demonstrate that it is convenient to use CoTK for model development and evaluation, particularly across different experimental settings.

الحساب واللغة التعلم الآلي

OpenAttack: An Open-source Textual Adversarial Attack Toolkit

187 - Guoyang Zeng , Fanchao Qi , Qianrui Zhou 2020

Textual adversarial attacking has received wide and increasing attention in recent years. Various attack models have been proposed, which are enormously distinct and implemented with different programming frameworks and settings. These facts hinder q uick utilization and apt comparison of attack models. In this paper, we present an open-source textual adversarial attack toolkit named OpenAttack. It currently builds in 12 typical attack models that cover all the attack types. Its highly inclusive modular design not only supports quick utilization of existing attack models, but also enables great flexibility and extensibility. OpenAttack has broad uses including comparing and evaluating attack models, measuring robustness of a victim model, assisting in developing new attack models, and adversarial training. Source code, built-in models and documentation can be obtained at https://github.com/thunlp/OpenAttack.

الحساب واللغة الذكاء الاصطناعي التشفير والأمن

OpenNMT: Open-Source Toolkit for Neural Machine Translation

79 - Guillaume Klein , Yoon Kim , Yuntian Deng 2017

We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modaliti es, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques.

الحساب واللغة الذكاء الاصطناعي الحوسبة العصبية والتطورية

CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++

65 - Xiaolin Wang , Masao Utiyama , Eiichiro Sumita 2018

This paper presents an open-source neural machine translation toolkit named CytonMT (https://github.com/arthurxlw/cytonMt). The toolkit is built from scratch only using C++ and NVIDIAs GPU-accelerated libraries. The toolkit features training efficien cy, code simplicity and translation quality. Benchmarks show that CytonMT accelerates the training speed by 64.5% to 110.8% on neural networks of various sizes, and achieves competitive translation quality.

الحساب واللغة

Multi-stage Pre-training over Simplified Multimodal Pre-training Models

96 - Tongtong Liu , Fangxiang Feng , Xiaojie Wang 2021

Multimodal pre-training models, such as LXMERT, have achieved excellent results in downstream tasks. However, current pre-trained models require large amounts of training data and have huge model sizes, which make them difficult to apply in low-resou rce situations. How to obtain similar or even better performance than a larger model under the premise of less pre-training data and smaller model size has become an important problem. In this paper, we propose a new Multi-stage Pre-training (MSP) method, which uses information at different granularities from word, phrase to sentence in both texts and images to pre-train the model in stages. We also design several different pre-training tasks suitable for the information granularity in different stage in order to efficiently capture the diverse knowledge from a limited corpus. We take a Simplified LXMERT (LXMERT- S), which has only 45.9% parameters of the original LXMERT model and 11.76% of the original pre-training data as the testbed of our MSP method. Experimental results show that our method achieves comparable performance to the original LXMERT model in all downstream tasks, and even outperforms the original model in Image-Text Retrieval task.

الحساب واللغة