An Architecture for Accelerated Large-Scale Inference of Transformer-Based Language Models

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

This work demonstrates the development process of a machine learning architecture for inference that can scale to a large volume of requests. We used a BERT model that was fine-tuned for emotion analysis, returning a probability distribution of emotions given a paragraph. The model was deployed as a gRPC service on Kubernetes. Apache Spark was used to perform inference in batches by calling the service. We encountered some performance and concurrency challenges and created solutions to achieve faster running time. Starting with 200 successful inference requests per minute, we were able to achieve as high as 18 thousand successful requests per minute with the same batch job resource allocation. As a result, we successfully stored emotion probabilities for 95 million paragraphs within 96 hours.
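The throughput figures in the abstract can be sanity-checked with a little arithmetic. The sketch below uses only the numbers quoted above (95 million paragraphs, 96 hours, 200 and 18 thousand requests per minute); the function and constant names are illustrative and do not come from the paper. It assumes one request per paragraph and a constant rate, which the abstract does not state, so the results are rough bounds rather than the authors' measurements.

```python
# Figures quoted in the abstract; names are hypothetical, chosen for this sketch.
TOTAL_PARAGRAPHS = 95_000_000
TIME_BUDGET_HOURS = 96
INITIAL_RATE_PER_MIN = 200
PEAK_RATE_PER_MIN = 18_000


def hours_to_finish(total_requests: int, requests_per_minute: float) -> float:
    """Hours needed to process total_requests at a constant per-minute rate."""
    return total_requests / requests_per_minute / 60


def required_rate_per_minute(total_requests: int, hours: float) -> float:
    """Average per-minute rate needed to finish total_requests within hours."""
    return total_requests / (hours * 60)


# Assumption: one inference request per paragraph, sustained at a constant rate.
initial_hours = hours_to_finish(TOTAL_PARAGRAPHS, INITIAL_RATE_PER_MIN)
peak_hours = hours_to_finish(TOTAL_PARAGRAPHS, PEAK_RATE_PER_MIN)
needed_rate = required_rate_per_minute(TOTAL_PARAGRAPHS, TIME_BUDGET_HOURS)
speedup = PEAK_RATE_PER_MIN / INITIAL_RATE_PER_MIN

print(f"At the initial rate: {initial_hours:,.0f} hours ({initial_hours / 24:,.0f} days)")
print(f"At the peak rate:    {peak_hours:,.1f} hours")
print(f"Average rate needed for {TIME_BUDGET_HOURS} hours: {needed_rate:,.0f} requests/minute")
print(f"Peak-to-initial speedup: {speedup:.0f}x")
```

Under these assumptions, the initial rate would need roughly 330 days for the full corpus, while the peak rate brings the job to just under 88 hours, which is consistent with the reported 96-hour window.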

References used

https://aclanthology.org/

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Association for Computational Linguistics, 2021 (article)

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.

Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection

Association for Computational Linguistics, 2021 (article)

The introduction of transformer-based language models has been a revolutionary step for natural language processing (NLP) research. These models, such as BERT, GPT and ELECTRA, led to state-of-the-art performance in many NLP tasks. Most of these models were initially developed for English, and other languages followed later. Recently, several Arabic-specific models started emerging. However, there are limited direct comparisons between these models. In this paper, we evaluate the performance of 24 of these models on Arabic sentiment and sarcasm detection. Our results show that the models achieving the best performance are those that are trained on only Arabic data, including dialectal Arabic, and use a larger number of parameters, such as the recently released MARBERT. However, we noticed that AraELECTRA is one of the top performing models while being much more efficient in its computational cost. Finally, the experiments on AraGPT2 variants showed low performance compared to BERT models, which indicates that it might not be suitable for classification tasks.

Large-Scale Contextualised Language Modelling for Norwegian

Association for Computational Linguistics, 2021 (article)

We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training. This paper introduces the first large-scale monolingual language models for Norwegian, based on both the ELMo and BERT frameworks. In addition to detailing the training process, we present contrastive benchmark results on a suite of NLP tasks for Norwegian. For additional background and access to the data, models, and software, please see: http://norlm.nlpl.eu

Probing for Bridging Inference in Transformer Language Models

Association for Computational Linguistics, 2021 (article)

We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers focus prominently on bridging relations in comparison with the lower and middle layers; also, a few specific attention heads concentrate consistently on bridging. More importantly, we consider language models as a whole in our second approach, where bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). Our formulation produces optimistic results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor and antecedent and the context provided to language models play an important role in the inference.

QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval

Association for Computational Linguistics, 2021 (article)

The embedding-based large-scale query-document retrieval problem is a hot topic in the information retrieval (IR) field. Considering that pre-trained language models like BERT have achieved great success in a wide variety of NLP tasks, we present a QuadrupletBERT model for effective and efficient retrieval in this paper. Unlike most existing BERT-style retrieval models, which only focus on the ranking phase in retrieval systems, our model makes considerable improvements to the retrieval phase and leverages the distances between simple negative and hard negative instances to obtain better embeddings. Experimental results demonstrate that our QuadrupletBERT achieves state-of-the-art results in embedding-based large-scale retrieval tasks.
