Automatic detection of plagiarism in Arabic documents based on lexical chains

كشف حالات الإنتحال في النصوص المدونة باللغة العربية بالإعتماد على السلاسل اللغوية

Added by University of Sfax. Research paper.

Publication date 2011

Field: Informatics Engineering

Research language: Arabic

Authors: ماهر الجوة (researcher) - فاطمة القلال الجوة (researcher) - لمياء هدريش بلغيث (researcher) - عبد المجيد بن حمادو (researcher)

Abstract

This paper deals with the automatic detection of plagiarism in Arabic documents. We present a new approach based on lexical chains. The proposed method extracts these chains from the original document and uses a search engine to check whether they occur in other documents. In a second step, the method uses an automatic translation system to translate the lexical chains and again uses a search engine to check whether those chains occur in documents written in other languages. We then compute a correlation ratio between the lexical chains of the original document and the lexical chains extracted from the documents returned by the search engine, in order to detect plagiarism in the original document. At the end of the paper, we present our prototype, called «Alkachef», developed to detect plagiarism in Arabic documents.
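The comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not define how chains are built or how the correlation ratio is computed, so everything here is an assumption: the toy thesaurus, the function names `lexical_chains` and `chain_overlap_ratio`, and the rule that two chains match when they share at least two words are all hypothetical. English words are used only for readability.

```python
from collections import defaultdict

# Hypothetical toy thesaurus mapping a word to a concept label.
# A real system would rely on an Arabic lexical resource instead.
TOY_THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "doctor": "medicine", "nurse": "medicine", "hospital": "medicine",
}

def lexical_chains(text, thesaurus=TOY_THESAURUS, min_length=2):
    """Group words that share a concept; keep groups with at least min_length distinct words."""
    groups = defaultdict(set)
    for token in text.lower().split():
        word = token.strip(".,;:!?\"'()")
        concept = thesaurus.get(word)
        if concept is not None:
            groups[concept].add(word)
    return {frozenset(words) for words in groups.values() if len(words) >= min_length}

def chain_overlap_ratio(source_chains, candidate_chains):
    """Fraction of source chains sharing at least two words with some candidate chain.

    This overlap measure is an assumption, not the ratio used in the paper.
    """
    if not source_chains:
        return 0.0
    matched = sum(
        1 for s in source_chains
        if any(len(s & c) >= 2 for c in candidate_chains)
    )
    return matched / len(source_chains)

source = "The doctor met the nurse at the hospital. A car and a truck passed."
candidate = "A nurse and a doctor work here. The bus left."
ratio = chain_overlap_ratio(lexical_chains(source), lexical_chains(candidate))
```

In a pipeline like the one the abstract outlines, `candidate` would be a document returned by the search engine, and a high ratio would flag the original document for closer inspection.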

References used

Belguith L., Baccour L., Mourad G., “Segmentation de textes arabes basée sur l'analyse contextuelle des signes de ponctuations et de certaines particules”, Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles TALN’2005, Vol. 1, pp. 451–456, Dourdan, France, 6–10 juin 2005.

Morris J., Hirst G., “Lexical cohesion computed by thesaural relations as an indicator of the structure of text”, in Computational Linguistics 17(1), pp. 21–43, 1991.

Seaward L., Matwin S., “Intrinsic Plagiarism Detection using Complexity Analysis”, in PAN'09, pp. 56–61, 2009.

Plagiarism Detection in Arabic Language using Rhetorical Structure Theory

Damascus University, 2014. Research paper.

This paper presents a review of available plagiarism detection algorithms and systems, and an implementation of a plagiarism detection system that uses search engines available on the web. Plagiarism detection in natural language documents is a complicated problem that is tied to the characteristics of the language itself. Many algorithms are available for plagiarism detection in natural languages. Generally, these algorithms fall into two main categories: the first is fingerprint-based detection, and the second is detection based on content comparison, which includes string matching and tree matching algorithms. Available plagiarism detection systems usually use one specific type of detection algorithm, or a mixture of algorithms, to achieve effective (fast and accurate) detection. In this research, a plagiarism detection system has been developed using the Bing search engine and a plagiarism detection algorithm based on Rhetorical Structure Theory.

Arabic plagiarism detection, Rhetorical Structure Theory, Natural language processing
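The fingerprint category mentioned in this abstract can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes word-level k-grams hashed with MD5 and compared with a Jaccard ratio, and the function names `fingerprint` and `resemblance` are hypothetical.

```python
import hashlib

def fingerprint(text, k=3):
    """Return the set of hashes of all word-level k-grams in the text."""
    words = text.lower().split()
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.md5(s.encode("utf-8")).hexdigest() for s in shingles}

def resemblance(a, b, k=3):
    """Jaccard ratio of the two fingerprints; 0.0 if either text has fewer than k words."""
    fa, fb = fingerprint(a, k), fingerprint(b, k)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)

score = resemblance("the quick brown fox jumps", "the quick brown fox sleeps")
```

Practical fingerprinting systems usually keep only a sample of the hashes to save space. This sketch keeps all of them so that the comparison stays easy to follow.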

English-Arabic Cross-language Plagiarism Detection

Association for Computational Linguistics, 2021. Article.

The advancement of the web and information technology has contributed to the rapid growth of digital libraries and of automatic machine translation tools that easily translate texts from one language into another. These have increased the content accessible in different languages, which makes it easy to commit translated plagiarism, referred to as "cross-language plagiarism". Recognizing plagiarism among texts in different languages is more challenging than identifying plagiarism within a corpus written in a single language. This paper proposes a new technique for enhancing English-Arabic cross-language plagiarism detection at the sentence level. The technique is based on semantic and syntactic feature extraction using word order, word embedding, and word alignment with multilingual encoders. These features, and their combinations with different machine learning (ML) algorithms, are then used to classify sentences as either plagiarized or non-plagiarized. The proposed approach has been deployed and assessed using datasets presented at SemEval-2017. Analysis of the experimental data shows that using the extracted features and their combinations with various ML classifiers achieves promising results.

cross-language plagiarism detection, English-Arabic cross-language plagiarism, cross-language plagiarism

Study about Arabic Text Documents Classification using Ontologies

Al-Baath University, 2014. Research paper.

In this paper, we introduce an algorithm that groups Arabic documents in order to build an ontology and its vocabulary. We ran the algorithm on five ontologies using Java. Processing the documents yielded 338667 words, each with weights corresponding to each ontology. The algorithm proved effective at improving the performance of the classifiers tested in this study (SVM, NB), compared with earlier classifier results for the Arabic language.

Ontology, Arabic language, semantic web, documents classification, text categorization, text mining, SVM, NB

iCompass at Shared Task on Sarcasm and Sentiment Detection in Arabic

Association for Computational Linguistics, 2021. Article.

We describe the system we submitted to the 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic (Abu Farha et al., 2021). We tackled both subtasks, namely Sarcasm Detection (Subtask 1) and Sentiment Analysis (Subtask 2). We used state-of-the-art pretrained contextualized text representation models and fine-tuned them for the downstream task at hand. As a first approach, we used Google's multilingual BERT, followed by several Arabic variants: AraBERT, ARBERT and MARBERT. The results show that MARBERT outperforms all of the other models on both Subtask 1 and Subtask 2.

sentiment classification, sarcasm and sentiment

Automatic Prosody Generation for Arabic Text-To-Speech Systems

Damascus University, 2011. Research paper.

The main purpose of the present research is to give Arabic Text-to-Speech synthesizers natural prosody, based on linguistic analysis of the texts to be synthesized and on automatic prosody generation using rules deduced from the analysis of recorded signals of different types of Arabic sentences. All the types of Arabic sentences (declarative and constructive) were enumerated with the help of an expert in Arabic linguistics. A textual corpus of about 2500 sentences covering most of these types was built and recorded both with natural prosody and without prosody. These sentences were then analyzed to extract the effect of prosody on the signal parameters and to build prosody generation rules. In this paper, we present the results on negation sentences, applied to synthesized speech using the open source tool MBROLA. The results can be used with any parametric Arabic synthesizer. Future work will apply the rules to a new Arabic synthesizer based on semi-syllable units, which is under development at the Higher Institute for Applied Sciences and Technology.

Arabic Text To Speech, Prosodic Parameters, Automatic Prosody Generation Rules, Linguistic analysis, Text Corpus, Speech Corpus, Speech Signal Analysis
