
Classification of Arabic Texts Using Object Properties in Databases

تصنيف النصوص العربية باستخدام الخصائص العرضية في قواعد البيانات

Publication date: 2016
Language: Arabic





In this research we present a detailed study of one of the data mining functions applied to text data, using object properties in databases, and we examine the possibility of applying this function to Arabic texts. We use the procedural query language PL/SQL, which works with objects in Oracle databases. A data mining model was built that classifies Arabic text documents: the SVM algorithm is used for text indexing and preparation, and the Naïve Bayes algorithm classifies the data after it has been transformed into nested tables. We then evaluate the obtained results and draw conclusions.
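The abstract describes this in-database workflow only at a high level. The sketch below shows, under assumed names, how a Naïve Bayes classification model of this kind is typically created and scored with Oracle's DBMS_DATA_MINING PL/SQL package; the table ARABIC_DOCS_TRAIN, the columns DOC_ID and DOC_CLASS, the settings table NB_SETTINGS, the test table arabic_docs_test and the model name ARABIC_DOC_NB are illustrative assumptions, not the schema actually used in the paper.

-- Assumed training table ARABIC_DOCS_TRAIN(DOC_ID, DOC_CLASS, plus nested-table
-- columns holding the prepared term weights); all names are illustrative.
CREATE TABLE nb_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  -- Select the Naive Bayes algorithm for the classification function
  INSERT INTO nb_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_NAIVE_BAYES);
  COMMIT;

  -- Build the classifier on the prepared training data
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'ARABIC_DOC_NB',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'ARABIC_DOCS_TRAIN',
    case_id_column_name => 'DOC_ID',
    target_column_name  => 'DOC_CLASS',
    settings_table_name => 'NB_SETTINGS');
END;
/

-- Score unseen documents with the trained model
SELECT doc_id,
       PREDICTION(ARABIC_DOC_NB USING *)             AS predicted_class,
       PREDICTION_PROBABILITY(ARABIC_DOC_NB USING *) AS probability
  FROM arabic_docs_test;

Keeping both training and scoring inside the database in this way is what allows the model to work directly on the object and nested-table structures built from the Arabic documents.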

Related research

In this paper, we introduce an algorithm for grouping Arabic documents to build an ontology and its words. We execute the algorithm on five ontologies using Java. We process the documents and obtain 338667 words with their weights corresponding to each ontology. The algorithm proved its efficiency in improving the performance of the classifiers (SVM, NB) tested in this study, compared with previous classifier results for Arabic.
In this study, we examine language change in Chinese Biji using a classification task: classifying Ancient Chinese texts by time period. Specifically, we focus on a unique genre in classical Chinese literature: Biji (literally "notebook" or "brush notes"), i.e., collections of anecdotes, quotations, and anything else the authors considered noteworthy. Biji span hundreds of years across many dynasties and preserve informal language in written form. For these reasons, they are regarded as a good resource for investigating language change in Chinese (Fang, 2010). In this paper, we create a new dataset of 108 Biji across four dynasties. Based on the dataset, we first introduce a time period classification task for Chinese. Then we investigate different feature representation methods for classification. The results show that models using contextualized embeddings perform best. An analysis of the top features chosen by the word n-gram model (after bleaching proper nouns) confirms that these features are informative and correspond to observations and assumptions made by historical linguists.
In this paper, we present a Modern Standard Arabic (MSA) sentence difficulty classifier, which predicts the difficulty of sentences for language learners using either the CEFR proficiency levels or a binary classification as simple or complex. We compare the use of sentence embeddings of different kinds (fastText, mBERT, XLM-R and Arabic-BERT), as well as traditional language features such as POS tags, dependency trees, readability scores and frequency lists for language learners. Our best results were achieved using fine-tuned Arabic-BERT. Our 3-way CEFR classification reaches an F-1 of 0.80 and 0.75 for Arabic-BERT and XLM-R respectively, and a 0.71 Spearman correlation for regression. Our binary difficulty classifier reaches an F-1 of 0.94, and an F-1 of 0.98 for the sentence-pair semantic similarity classifier.
In this project we chose to develop a system that classifies Arabic documents by their content. The system performs lexical analysis of the document's words, then applies stemming (reducing words to their roots), then applies a statistical process to the document during the system's training phase; finally, relying on artificial intelligence algorithms, the document is classified by its content into clusters.
We use Hypergraph Attention Networks (HyperGAT) to recognize multiple labels of Chinese humor texts. We first represent a joke as a hypergraph. Sequential hyperedge and semantic hyperedge structures are used to construct hyperedges. Then, attention mechanisms are adopted to aggregate context information embedded in nodes and hyperedges. Finally, we use the trained HyperGAT to complete the multi-label classification task. Experimental results on the Chinese humor multi-label dataset show that the HyperGAT model outperforms previous sequence-based (CNN, BiLSTM, FastText) and graph-based (Graph-CNN, TextGCN, Text Level GNN) deep learning models.