
ALUE: Arabic Language Understanding Evaluation

ALUE: Evaluation in the Arabic Language

Publication date: 2021
Research language: English
Created by: Shamra Editor





The emergence of Multi-task Learning (MTL) models in recent years has helped push the state of the art in Natural Language Understanding (NLU). We strongly believe that many NLU problems in Arabic are especially poised to reap the benefits of such models. To this end, we propose the Arabic Language Understanding Evaluation benchmark (ALUE), based on 8 carefully selected and previously published tasks. For five of these, we provide new privately held evaluation datasets to ensure the fairness and validity of our benchmark. We also provide a diagnostic dataset to help researchers probe the inner workings of their models. Our initial experiments show that MTL models outperform their singly trained counterparts on most tasks. But in order to entice participation from the wider community, we stick to publishing singly trained baselines only. Nonetheless, our analysis reveals that there is plenty of room for improvement in Arabic NLU. We hope that ALUE will play a part in helping our community realize some of these improvements. Interested researchers are invited to submit their results to our online, publicly accessible leaderboard.
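To make the benchmark setup concrete, the following is a minimal sketch of how one model might be scored across an ALUE-style suite of tasks. The task names, metric choices, and the model's predict interface are illustrative assumptions, not the actual ALUE tooling, which lives behind the official leaderboard.

    # Sketch: scoring one model across a suite of NLU tasks, each with
    # its own metric, and reporting a mean score as a leaderboard would.
    # Task names, metrics, and the predict() interface are assumptions.
    from statistics import mean
    from sklearn.metrics import accuracy_score, f1_score

    # Hypothetical task -> metric mapping (ALUE fixes one per task).
    TASK_METRICS = {
        "sentiment": accuracy_score,
        "irony": lambda gold, pred: f1_score(gold, pred, average="macro"),
        "offensive": lambda gold, pred: f1_score(gold, pred, average="macro"),
    }

    def evaluate(model, datasets):
        """Score `model` on each task. `datasets` maps a task name to
        (texts, gold_labels); model.predict returns one label per text."""
        scores = {}
        for task, (texts, gold) in datasets.items():
            preds = model.predict(task, texts)
            scores[task] = TASK_METRICS[task](gold, preds)
        scores["mean"] = mean(scores.values())
        return scores

A multi-task model would share an encoder across the per-task heads behind predict(), while a singly trained baseline is simply one such model per task.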

Related research

Advances in English language representation enabled a more sample-efficient pre-training task, Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. Current Arabic language representation approaches, on the other hand, rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition, and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and an even smaller model size.
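Since replaced token detection is the core difference from masked language modeling, here is a hedged PyTorch sketch of that loss. The generator and discriminator are assumed stand-ins for full transformer encoders, and the complete ELECTRA objective also adds the generator's own masked-LM loss, omitted here.

    # Sketch of ELECTRA-style replaced token detection (RTD) in PyTorch.
    # Only the loss logic is shown; model internals are assumed.
    import torch
    import torch.nn.functional as F

    def rtd_loss(generator, discriminator, input_ids, mask_positions, mask_id):
        # 1. Hide a subset of tokens and let the generator fill them in.
        masked = input_ids.masked_fill(mask_positions, mask_id)
        with torch.no_grad():  # sampling is not differentiable anyway
            gen_logits = generator(masked)  # [B, T, vocab]
            sampled = torch.distributions.Categorical(logits=gen_logits).sample()

        # 2. Corrupted input: generator samples at masked slots, originals elsewhere.
        corrupted = torch.where(mask_positions, sampled, input_ids)

        # 3. Per-token labels: 1 where the token no longer matches the original.
        labels = (corrupted != input_ids).float()

        # 4. Discriminator classifies every token as original vs. replaced.
        disc_logits = discriminator(corrupted)  # [B, T]
        return F.binary_cross_entropy_with_logits(disc_logits, labels)

Because every token position contributes to the loss, not just the small masked subset, the discriminator sees a much denser training signal, which is where the sample efficiency comes from.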
A new intelligent neural network built into an expert system has been designed to parse the Arabic language. Arabic sentences have been studied, analyzed, and classified into new syntactic fields. Each syntactic field consists of essential sentence components (verb, object, and so on). All emerging Arabic sentences have been enumerated and broken down into verbal and noun fields.
Language models used in speech recognition are often either evaluated intrinsically using perplexity on test data, or extrinsically with an automatic speech recognition (ASR) system. The former evaluation does not always correlate well with ASR performance, while the latter could be specific to particular ASR systems. Recent work proposed to evaluate language models by using them to classify ground truth sentences among phonetically similar alternative sentences generated by a finite-state transducer. Underlying such an evaluation is the assumption that the generated sentences are linguistically incorrect. In this paper, we first put this assumption into question, and observe that alternatively generated sentences can often be linguistically correct when they differ from the ground truth by only one edit. Secondly, we show that by using multilingual BERT, we can achieve better performance than previous work on two code-switching datasets. Our implementation is publicly available on GitHub at https://github.com/sikfeng/language-modelling-for-code-switching.
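A sketch of the scoring step follows, assuming the common pseudo-log-likelihood recipe for masked LMs: mask each token in turn and sum the log-probabilities of the true tokens. The paper's exact setup may differ; the checkpoint name is simply multilingual BERT's standard one.

    # Sketch: rank phonetically similar candidates with multilingual BERT
    # via pseudo-log-likelihood. Illustrative recipe, not the paper's code.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name).eval()

    def pseudo_log_likelihood(sentence: str) -> float:
        ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
        total = 0.0
        for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id
            with torch.no_grad():
                logits = model(masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
        return total

    def pick_ground_truth(candidates):
        # The candidate the LM prefers is predicted to be the ground truth.
        return max(candidates, key=pseudo_log_likelihood)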
Knowing the vowels of the Hebrew language is one of the greatest obstacles faced by learners of Hebrew, because of their complexity compared to their counterparts in the Arabic language. In my research, I have worked hard to simplify them, as far as possible, for Arab readers by comparing them to their counterparts in Arabic. This research shows that most vowels in Hebrew have similar counterparts in Arabic, but Arab linguists did not allocate an independent vowel to each case as Hebrew linguists did, which suggests to the neophyte that the number of vowel symbols in Hebrew is larger than the number of those in Arabic.
Detecting offensive language on Twitter has many applications, ranging from detecting and predicting bullying to measuring polarization. In this paper, we focus on building a large Arabic offensive tweet dataset. We introduce a method for building a dataset that is not biased by topic, dialect, or target. We produce the largest Arabic dataset to date, with special tags for vulgarity and hate speech. We thoroughly analyze the dataset to determine which topics, dialects, and genders are most associated with offensive tweets and how Arabic speakers use offensive language. Lastly, we conduct many experiments to produce strong results (F1 = 83.2) on the dataset using SOTA techniques.
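For a sense of how such results are measured, below is a hedged sketch of a simple character n-gram SVM baseline scored with F1. The model choice and toy data are illustrative only; the reported F1 = 83.2 comes from stronger models trained on the full dataset.

    # Illustrative baseline only: TF-IDF character n-grams + linear SVM,
    # scored with F1 on the offensive class. Toy data as placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics import f1_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    train_texts = ["tweet one", "tweet two", "tweet three", "tweet four"]
    train_labels = [1, 0, 1, 0]  # 1 = offensive, 0 = clean (toy labels)

    clf = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
        LinearSVC(),
    )
    clf.fit(train_texts, train_labels)

    test_texts, test_labels = ["held-out tweet"], [1]
    print(f1_score(test_labels, clf.predict(test_texts)))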


