Pre-training Transformer-based models such as BERT and ELECTRA on a collection of Arabic corpora, as demonstrated by AraBERT and AraELECTRA, yields impressive results on downstream tasks. However, pre-training Transformer-based language models is computationally expensive, especially for large-scale models. Recently, Funnel Transformer has addressed the sequential redundancy inside the Transformer architecture by compressing the sequence of hidden states, leading to a significant reduction in pre-training cost. This paper empirically studies the performance and efficiency of building an Arabic language model with the Funnel Transformer architecture and the ELECTRA objective. We find that our model achieves state-of-the-art results on several Arabic downstream tasks despite using fewer computational resources than other BERT-based models.
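The cost saving described above comes from pooling the sequence of hidden states between blocks. The following toy sketch (plain Python, invented values; real models pool learned hidden states and later up-sample them) illustrates why: mean-pooling adjacent positions with stride 2 halves the sequence length, so the quadratic self-attention cost drops by roughly a factor of four.

```python
def mean_pool(hidden_states, stride=2):
    """Compress a sequence of hidden-state vectors by averaging
    each non-overlapping window of `stride` positions."""
    pooled = []
    for i in range(0, len(hidden_states) - stride + 1, stride):
        window = hidden_states[i:i + stride]
        dim = len(window[0])
        pooled.append([sum(vec[d] for vec in window) / stride for d in range(dim)])
    return pooled

# A length-4 sequence of 2-dimensional hidden states compresses to length 2.
seq = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
compressed = mean_pool(seq)
print(len(seq), "->", len(compressed))  # 4 -> 2
print(compressed)                       # [[2.0, 3.0], [6.0, 7.0]]
```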
The emergence of Multi-task Learning (MTL) models in recent years has helped push the state of the art in Natural Language Understanding (NLU). We strongly believe that many NLU problems in Arabic are especially poised to reap the benefits of such models. To this end we propose the Arabic Language Understanding Evaluation Benchmark (ALUE), based on 8 carefully selected and previously published tasks. For five of these, we provide new privately held evaluation datasets to ensure the fairness and validity of our benchmark. We also provide a diagnostic dataset to help researchers probe the inner workings of their models. Our initial experiments show that MTL models outperform their singly trained counterparts on most tasks. But in order to entice participation from the wider community, we stick to publishing singly trained baselines only. Nonetheless, our analysis reveals that there is plenty of room for improvement in Arabic NLU. We hope that ALUE will play a part in helping our community realize some of these improvements. Interested researchers are invited to submit their results to our online, publicly accessible leaderboard.
Advances in English language representation have enabled a more sample-efficient pre-training task: Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. Current Arabic language representation approaches, on the other hand, rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition, and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and even with a smaller model size.
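The replaced-token-detection setup can be sketched in a few lines. In this toy version (random replacement stands in for the generator network; the vocabulary and words are invented for illustration), note that every position receives a binary label, which is why the discriminator gets a training signal from all tokens rather than only the masked ones.

```python
import random

random.seed(0)

def corrupt(tokens, mask_prob=0.3, vocab=("كتاب", "قلم", "بيت", "مدرسة")):
    """Stand-in for ELECTRA's generator: replace some tokens with
    sampled alternatives, and return the corrupted sequence plus the
    per-token labels the discriminator must predict
    (0 = original, 1 = replaced)."""
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            corrupted.append(random.choice([w for w in vocab if w != tok]))
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

tokens = ["كتاب", "قلم", "بيت"]
corrupted, labels = corrupt(tokens)
# Every position carries a label, so the whole sequence contributes
# to the loss -- the source of ELECTRA's sample efficiency.
assert all((c == t) == (l == 0) for c, t, l in zip(corrupted, tokens, labels))
```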
Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable for training NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it on the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase of 5 BLEU points over the previous state-of-the-art model. In addition, our proposed model was rated highly by 85 human evaluators, validating its high capability to exhibit empathy while generating relevant and fluent responses in open-domain settings.
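The warm-start idea above (initializing an encoder-decoder from pretrained encoder weights) can be sketched by copying every pretrained tensor whose name the new model shares and leaving the rest randomly initialized. Parameter names and shapes below are illustrative, not AraBERT's actual checkpoint layout.

```python
import random

random.seed(0)

def rand_matrix(rows, cols):
    return [[random.random() for _ in range(cols)] for _ in range(rows)]

# Hypothetical parameter dicts standing in for a pretrained encoder
# (e.g. AraBERT) and a decoder being warm-started from it.
pretrained = {
    "embeddings.weight": rand_matrix(8, 4),
    "layer0.attention.weight": rand_matrix(4, 4),
}

decoder = {
    "embeddings.weight": rand_matrix(8, 4),
    "layer0.attention.weight": rand_matrix(4, 4),
    "layer0.cross_attention.weight": rand_matrix(4, 4),  # no pretrained match
}

def init_from_pretrained(model, pretrained):
    """Copy every pretrained tensor whose name the model shares;
    parameters without a match (here, cross-attention, which has no
    counterpart in an encoder-only checkpoint) keep their random init."""
    transferred = []
    for name in model:
        if name in pretrained:
            model[name] = [row[:] for row in pretrained[name]]
            transferred.append(name)
    return transferred

moved = init_from_pretrained(decoder, pretrained)
print(moved)  # ['embeddings.weight', 'layer0.attention.weight']
```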
The present study aimed to detect the degree to which Arabic language teachers in the Directorate of Education for the North Eastern Badia region exercise creative thinking skills. The study's sample consisted of 200 Arabic language teachers of the sixth and seventh grades. To achieve the objectives of the study, the researcher used a questionnaire composed of 63 items. The results showed that the degree to which Arabic language teachers exercise creative thinking skills in developing their students was moderate on the instrument's total score, as well as in the fields of freedom of expression, a positive perspective towards creativity, teaching methods, methods of evaluation, the class environment, and creativity stimulation. The results also pointed to the absence of statistically significant differences in this degree, across all fields of the study, depending on the variables of gender, experience, and qualifications. Accordingly, the study concluded with a number of related recommendations.
The study aimed at investigating the linguistic performance of Arabic language teachers and its relation to their attitudes towards teaching. The sample of the study consisted of 40 Arabic teachers from the public schools in the Northeastern Badia Directorate of Education. To achieve the purpose of the study, an analytical descriptive approach was used. The instruments of the study were an observation card and a scale of attitudes towards teaching. The results showed that both the linguistic performance of Arabic teachers and their attitudes toward teaching were medium, and indicated a strong correlation between their linguistic performance and their attitudes toward teaching.
This study aimed at analyzing the level of representation of linguistic performance in Arabic language curricula, represented by the language skills of listening, conversation, reading, and writing, in accordance with the teaching outcomes embodied in the objectives, in order to assess the appropriateness of the content of the Arabic language curriculum for the predetermined objectives. The analysis showed the following representation percentages of the content by skill, across all grades: listening comprehension 80.75%; writing skills 84.3%; conversation skills 91.25%; reading comprehension 92.8%. At the curriculum level, the representation percentages of the content across all skills by grade were: first grade 89.5%; second grade 89.125%; third grade 87.875%; fourth grade 87.375%; fifth grade 85.5%; sixth grade 84.375%. The study concluded with a number of recommendations.
In this paper, we introduce an algorithm for grouping Arabic documents to build ontologies and their associated words. We execute the algorithm on five ontologies using Java, processing the documents to obtain 338,667 words with their weights corresponding to each ontology. The algorithm proved its efficiency in improving the performance of the classifiers tested in this study (SVM, NB), compared with previously reported classifier results for the Arabic language.
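The paper does not publish its algorithm, but the core idea of weighting words per ontology and scoring documents against those weights can be sketched as follows. This is a minimal toy version: the ontology names and documents are invented, raw term frequency stands in for whatever weighting scheme the paper uses, and real systems would feed such weights into an SVM or Naive Bayes classifier rather than a bare argmax.

```python
from collections import Counter

# Pool the documents of each ontology and weight every word by its
# frequency in that pool (toy stand-in for the paper's word weights).
ontologies = {
    "sports": ["الفريق فاز المباراة", "المباراة انتهت بالتعادل"],
    "economy": ["ارتفعت الأسعار في السوق", "السوق شهد نموا"],
}

weights = {
    name: Counter(word for doc in docs for word in doc.split())
    for name, docs in ontologies.items()
}

def classify(doc):
    """Assign a document to the ontology whose word weights
    best cover its words."""
    scores = {
        name: sum(w[word] for word in doc.split())
        for name, w in weights.items()
    }
    return max(scores, key=scores.get)

print(classify("نتيجة المباراة"))  # sports
```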
The absence of diacritization in Arabic texts is one of the most important challenges facing automatic Arabic language processing. When reading, an Arabic reader can infer the correct diacritics of words, while computers need algorithms to restore the diacritization based on knowledge at different levels. Diacritization here includes all the diacritics (damma, fatha, kasra, sukun), in addition to shadda and tanween. Some diacritization methods are based on the linguistic processing of texts, while others are based on statistical methods using textual corpora; some systems integrate the two methodologies in hybrid approaches. In this paper we present a comprehensive study of the different methods that have been adopted in these diacritization systems. In addition, we review the various corpora that have been used for testing and evaluation, then suggest the specifications of the Arabic corpus needed for diacritization systems, and the standards that the evaluation process must take into consideration. The main objective is to develop an action plan for the construction of an automatic diacritizer of Arabic texts under the auspices of ALECSO, with the participation of many research entities from different countries.
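To make the statistical, corpus-based family of methods concrete, here is a deliberately minimal sketch: learn, from a tiny diacritized corpus, the most frequent diacritized form of each bare word, then restore diacritics by lookup. Real systems surveyed in such work use context-aware sequence models and linguistic rules; the corpus below is invented for illustration.

```python
from collections import Counter, defaultdict

# Unicode combining marks for the Arabic harakat, tanween, shadda, sukun.
ARABIC_DIACRITICS = set("\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652")

def strip_diacritics(word):
    """Remove all diacritic marks, leaving the bare letters."""
    return "".join(ch for ch in word if ch not in ARABIC_DIACRITICS)

def build_model(diacritized_corpus):
    """Map each bare word to its most frequent diacritized form."""
    counts = defaultdict(Counter)
    for word in diacritized_corpus:
        counts[strip_diacritics(word)][word] += 1
    return {bare: forms.most_common(1)[0][0] for bare, forms in counts.items()}

corpus = ["كَتَبَ", "كَتَبَ", "كُتُبٌ", "قَلَمٌ"]  # toy diacritized corpus
model = build_model(corpus)

def diacritize(text):
    """Restore diacritics word by word; unknown words pass through."""
    return " ".join(model.get(w, w) for w in text.split())

print(diacritize("كتب قلم"))  # كَتَبَ قَلَمٌ (the majority form wins for كتب)
```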
In this paper we present a web-based interactive Arabic dictionary developed at HIAST (Higher Institute for Applied Sciences and Technology). Users can search for any Arabic word online. The system provides the word's different meanings with example sentences and multimedia illustrations, in addition to other related information such as associated words, semantic domains, expressions, linguistic notes, common mistakes, and morphological, syntactic, and semantic information. The dictionary can be enriched collaboratively by expert users with new words, new meanings for existing entries, or other related morphological, syntactic, and semantic information.
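A dictionary organized this way is naturally a headword-to-entry map where each entry carries multiple meanings and related information, and collaborative enrichment appends to existing entries. The sketch below is hypothetical: the field names are illustrative, not HIAST's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Meaning:
    definition: str
    examples: list = field(default_factory=list)
    semantic_domain: str = ""

@dataclass
class Entry:
    headword: str
    meanings: list = field(default_factory=list)
    associated_words: list = field(default_factory=list)
    common_mistakes: list = field(default_factory=list)

lexicon = {}

def add_meaning(headword, meaning):
    """Collaborative enrichment: create the entry on first use,
    then append further meanings to the same entry."""
    entry = lexicon.setdefault(headword, Entry(headword))
    entry.meanings.append(meaning)
    return entry

# The word عين is polysemous: one headword accumulates several senses.
add_meaning("عين", Meaning("eye (organ of sight)", ["رأت عينُه النورَ"]))
add_meaning("عين", Meaning("spring (water source)", semantic_domain="geography"))
print(len(lexicon["عين"].meanings))  # 2
```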