Pre-training a BERT with Curriculum Learning by Increasing Block-Size of Input Text

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Tags: sequence tagging, increasing block-size, learning by increasing

Abstract

Recently, pre-trained language representation models such as BERT and RoBERTa have achieved significant results in a wide range of natural language processing (NLP) tasks. However, pre-training them requires an extremely high computational cost. Curriculum Learning (CL) is one potential way to alleviate this problem. CL is a training strategy in which training samples are presented to the model in a meaningful order rather than sampled at random. In this work, we propose a new CL method that gradually increases the block-size of the input text used to train the self-attention mechanism of BERT and its variants, using the maximum available batch-size at each stage. Experiments in low-resource settings show that our approach outperforms the baseline in terms of convergence speed and final performance on downstream tasks.
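As a rough illustration of the idea in the abstract above, the following Python sketch shows one way a block-size curriculum could be scheduled. It is not the authors' implementation. The schedule values, the token budget, and all function names are hypothetical, and a fixed token budget is only a crude stand-in for "maximum available batch-size", since real memory use also depends on the attention cost, which grows faster than linearly with block-size.

```python
from typing import List, Sequence, Tuple

# Hypothetical curriculum: (first training step, block size in tokens).
# These numbers are illustrative assumptions, not values from the paper.
SCHEDULE: List[Tuple[int, int]] = [(0, 64), (1000, 128), (2000, 256), (3000, 512)]


def block_size_at(step: int, schedule: Sequence[Tuple[int, int]]) -> int:
    """Return the block size active at `step`.

    `schedule` must be sorted by its first element (the starting step).
    """
    size = schedule[0][1]
    for start, block in schedule:
        if step >= start:
            size = block
    return size


def batch_size_for(block_size: int, token_budget: int) -> int:
    """Largest batch whose total token count fits in `token_budget`.

    Assumption: memory is approximated by tokens per step only.
    """
    return max(1, token_budget // block_size)


def split_into_blocks(token_ids: Sequence[int], block_size: int) -> List[List[int]]:
    """Cut a token stream into consecutive blocks of at most `block_size` tokens."""
    return [
        list(token_ids[i:i + block_size])
        for i in range(0, len(token_ids), block_size)
    ]


if __name__ == "__main__":
    budget = 8192  # hypothetical tokens per step
    for step in (0, 1500, 5000):
        block = block_size_at(step, SCHEDULE)
        print(step, block, batch_size_for(block, budget))
```

Under these assumptions, early steps train on many short blocks per batch, and later steps train on fewer, longer blocks, so the tokens processed per step stay roughly constant while the context seen by self-attention grows.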

References used

https://aclanthology.org/

Active Curriculum Learning

Association for Computational Linguistics, 2021, article

This paper investigates the relationship between two closely related machine learning disciplines, Active Learning (AL) and Curriculum Learning (CL), through the lens of several novel curricula. It also introduces Active Curriculum Learning (ACL), which improves AL by combining it with CL to benefit from both the dynamic nature of the AL informativeness concept and the human insights used in designing the curriculum heuristics. A comparison of ACL and AL on two public datasets for the Named Entity Recognition (NER) task shows the effectiveness of combining AL and CL using the proposed framework.

Tags: active curriculum learning, active curriculum

Pre-training with Meta Learning for Chinese Word Segmentation

Association for Computational Linguistics, 2021, article

Recent research shows that pre-trained models (PTMs) are beneficial to Chinese Word Segmentation (CWS). However, the PTMs used in previous work usually adopt language modeling as the pre-training task, so they lack task-specific prior segmentation knowledge and ignore the discrepancy between pre-training tasks and downstream CWS tasks. In this paper, we propose MetaSeg, a CWS-specific pre-trained model that employs a unified architecture and incorporates a meta-learning algorithm into a multi-criteria pre-training task. Empirical results show that MetaSeg can utilize common prior segmentation knowledge from different existing criteria and alleviate the discrepancy between pre-trained models and downstream CWS tasks. In addition, MetaSeg achieves new state-of-the-art performance on twelve widely used CWS datasets and significantly improves model performance in low-resource settings.

Tags: Chinese word segmentation, Chinese word, word segmentation

Increasing Input Block and Encryption Efficiency by Mixing BBM & IDEA

Al-Baath University, 2017, research paper

We take the IDEA algorithm and add stages based on BBM to obtain an enhanced algorithm with 3 keys and a 128-bit input block.

Tags: algorithms, encryption, threading, enhance

Few-Shot Intent Detection via Contrastive Pre-Training and Fine-Tuning

Association for Computational Linguistics, 2021, article

In this work, we focus on a more challenging few-shot intent detection scenario in which many intents are fine-grained and semantically similar. We present a simple yet effective few-shot intent detection schema based on contrastive pre-training and fine-tuning. Specifically, we first conduct self-supervised contrastive pre-training on collected intent datasets, which implicitly learns to discriminate semantically similar utterances without using any labels. We then perform few-shot intent detection together with supervised contrastive learning, which explicitly pulls utterances from the same intent closer together and pushes utterances from different intents farther apart. Experimental results show that the proposed method achieves state-of-the-art performance on three challenging intent detection datasets under 5-shot and 10-shot settings.

Tags: few-shot intent detection

Solving SCAN Tasks with Data Augmentation and Input Embeddings

Association for Computational Linguistics, 2021, article

We address the compositionality challenge presented by the SCAN benchmark. Using data augmentation and a modification of the standard seq2seq architecture with attention, we achieve state-of-the-art results on all the relevant tasks from the benchmark, showing that the models can generalize to words used in unseen contexts. We also propose extending the benchmark with a harder task that the proposed method cannot solve.

Tags: input embeddings, solving SCAN tasks, augmentation and input
