
English-Arabic Cross-language Plagiarism Detection



Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

cross-language plagiarism detection, English-Arabic cross-language plagiarism, cross-language plagiarism


Abstract

The advancement of the web and information technology has contributed to the rapid growth of digital libraries and of automatic machine translation tools that easily translate texts from one language into another. These have increased the content accessible in different languages, which makes translated plagiarism, referred to as "cross-language plagiarism", easy to perform. Recognizing plagiarism among texts in different languages is more challenging than identifying plagiarism within a corpus written in a single language. This paper proposes a new technique for enhancing English-Arabic cross-language plagiarism detection at the sentence level. The technique is based on semantic and syntactic feature extraction using word order, word embedding and word alignment with multilingual encoders. These features, and their combinations, are then used with different machine learning (ML) algorithms to classify sentences as either plagiarized or non-plagiarized. The proposed approach has been deployed and assessed using datasets presented at SemEval-2017. Analysis of the experimental data demonstrates that using the extracted features and their combinations with various ML classifiers achieves promising results.
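The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors in `TOY_EMBEDDINGS`, the greedy alignment, the word-order formula (one common formulation based on position vectors), and the weighted threshold in `is_plagiarized` are all assumptions made for illustration. A real system would use a trained multilingual encoder and a trained ML classifier.

```python
import math

# Hypothetical toy vectors standing in for a shared English-Arabic
# embedding space; a real system would use a trained multilingual encoder.
TOY_EMBEDDINGS = {
    "cat": (1.0, 0.0, 0.0, 0.0),
    "eats": (0.0, 1.0, 0.0, 0.0),
    "fish": (0.0, 0.0, 1.0, 0.0),
    "قطة": (0.9, 0.1, 0.0, 0.0),
    "تأكل": (0.1, 0.9, 0.0, 0.0),
    "سمك": (0.0, 0.1, 0.9, 0.0),
    "كتاب": (0.0, 0.0, 0.0, 1.0),
    "جديد": (0.0, 0.0, 0.1, 0.9),
}
ZERO = (0.0, 0.0, 0.0, 0.0)  # used for out-of-vocabulary words


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align(src, tgt):
    """Greedy word alignment: for each source word, return the index of the
    most similar target word and that similarity. Assumes tgt is non-empty."""
    pairs = []
    for word in src:
        vec = TOY_EMBEDDINGS.get(word, ZERO)
        sims = [cosine(vec, TOY_EMBEDDINGS.get(t, ZERO)) for t in tgt]
        best = max(range(len(tgt)), key=sims.__getitem__)
        pairs.append((best, sims[best]))
    return pairs


def extract_features(src, tgt):
    """Return (embedding similarity, word-order similarity) for a sentence pair."""
    pairs = align(src, tgt)
    # Semantic feature: mean similarity of each source word to its aligned word.
    emb_sim = sum(sim for _, sim in pairs) / len(pairs)
    # Syntactic feature: compare source positions with aligned target positions.
    r1 = [i + 1 for i in range(len(src))]
    r2 = [j + 1 for j, _ in pairs]
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    order_sim = 1.0 - diff / total
    return emb_sim, order_sim


def is_plagiarized(src, tgt, w_emb=0.8, w_order=0.2, threshold=0.7):
    """Stand-in for an ML classifier: a weighted score with a fixed threshold.
    The weights and threshold are illustrative assumptions, not tuned values."""
    emb_sim, order_sim = extract_features(src, tgt)
    return w_emb * emb_sim + w_order * order_sim >= threshold


english = ["cat", "eats", "fish"]
print(extract_features(english, ["قطة", "تأكل", "سمك"]))
print(is_plagiarized(english, ["قطة", "تأكل", "سمك"]))
print(is_plagiarized(english, ["كتاب", "جديد"]))
```

In this sketch the two features are combined by hand. In the setting the abstract describes, the feature tuple would instead be passed to a trained classifier.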

References used

https://aclanthology.org/


Plagiarism Detection in Arabic Language using Rhetorical Structure Theory

2218 - Damascus University 2014 research paper

This paper presents a review of available plagiarism detection algorithms and systems, and an implementation of a plagiarism detection system using search engines available on the web. Plagiarism detection in natural language documents is a complicated problem, related to the characteristics of the language itself. Many algorithms are available for plagiarism detection in natural languages. Generally, these algorithms belong to two main categories: the first is fingerprint-based plagiarism detection, and the second is content-comparison-based detection, which includes string matching and tree matching algorithms. Available plagiarism detection systems usually use one specific type of detection algorithm, or a mixture of algorithms, to achieve effective (fast and accurate) detection. In this research, a plagiarism detection system has been developed using the Bing search engine and a plagiarism detection algorithm based on Rhetorical Structure Theory.
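As a rough illustration of the fingerprint-based category mentioned above, the sketch below hashes overlapping word k-grams and compares two documents by the Jaccard overlap of their fingerprint sets. This is a generic textbook-style sketch, not the system described in the paper. The function names, the choice of `k=3`, and the truncated MD5 hash are assumptions for illustration.

```python
import hashlib


def fingerprints(text, k=3):
    """Return a set of short hashes, one per overlapping word k-gram."""
    words = text.lower().split()
    grams = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # hashlib gives stable hashes across runs (unlike the built-in hash()).
    return {hashlib.md5(g.encode("utf-8")).hexdigest()[:8] for g in grams}


def jaccard(a, b):
    """Jaccard similarity of two fingerprint sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


source = "the quick brown fox jumps over the lazy dog"
suspect = "a quick brown fox jumps over a sleepy cat"
print(jaccard(fingerprints(source), fingerprints(suspect)))
```

A higher score means more shared k-grams. A practical system would typically keep only a sample of the fingerprints and index them for fast lookup.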

Arabic plagiarism detection, Rhetorical Structure Theory, natural language processing

HateBERT: Retraining BERT for Abusive Language Detection in English

665 - Association for Computational Linguistics 2021 article

We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have curated and made available to the public. We present the results of a detailed comparison between a general pre-trained language model and the retrained version on three English datasets for offensive language, abusive language, and hate speech detection tasks. On all datasets, HateBERT outperforms the corresponding general BERT model. We also discuss a battery of experiments comparing the portability of the fine-tuned models across the datasets, suggesting that portability is affected by the compatibility of the annotated phenomena.

abusive language detection, retraining BERT, abusive language

Survey Of Traditional And Semantic Plagiarism Detection Algorithms

1801 - Tishreen University 2016 research paper

This paper reviews the advantages and limitations of the most effective techniques employed or developed for text plagiarism detection. Many of the proposed methods have weaknesses and fail to detect some types of plagiarism. The survey covers several important topics: the definition of plagiarism, plagiarism prevention and detection, plagiarism detection systems, plagiarism detection processes, and some current plagiarism detection techniques. The paper compares different plagiarism detection algorithms, showing their weaknesses and strengths, and describes the power of semantic plagiarism detection methods, which detect plagiarism cases that other algorithms cannot and were developed to overcome the weaknesses of traditional methods.

semantic plagiarism detection algorithms, detection process, detection techniques

Plagiarism Detection in Medical Research Using Medical Ontology

1862 - Tishreen University 2016 research paper

This paper presents a reference study of available plagiarism detection algorithms and develops a semantic plagiarism detection algorithm for medical research papers that employs the medical ontologies available on the World Wide Web. Plagiarism detection in medical research written in natural language is a complex problem tied to the specific domain of medical research. The algorithms used for plagiarism detection in natural language generally fall into two main categories: algorithms that compare files using file fingerprints, and content comparison algorithms, which include string matching and text and tree matching algorithms. Recently, much research has addressed semantic plagiarism detection algorithms, and some have been developed based on citation analysis models in scientific research. In this research, a plagiarism detection system was developed using the Bing search engine. It uses two types of ontologies: a general ontology, WordNet, and standard international ontologies in the medical domain, such as the Diseases ontology, which contains descriptions and definitions of diseases and the derivation relations between them.

natural language processing, semantic web, plagiarism detection, medical ontologies

Automatic detection of plagiarism in Arabic documents based on lexical chains

849 - University of Sfax 2011 research paper

This paper deals with the automatic detection of plagiarism in Arabic documents. We present a new idea based on experimentation with lexical chains. The proposed method extracts lexical chains from the original document and uses a search engine to check whether those chains occur in other documents. In a second step, the method uses an automatic translation system to translate the lexical chains and uses a search engine to check whether they occur in documents in other languages. We then compute a correlation ratio between the lexical chains of the original document and those extracted from the documents returned by the search engine to detect plagiarism in the original document. Finally, we present our prototype, « Alkachef », developed to detect plagiarism in Arabic documents.

natural language processing, plagiarism detection, scientific plagiarism, automatic plagiarism detection, lexical chains
