The proliferation of fake news is a current issue that affects several important areas of society, such as politics, the economy, and health. In Natural Language Processing, recent initiatives have tried to detect fake news in different ways, ranging from language-based approaches to content-based verification. In such approaches, the choice of features for classifying fake and true news is one of the most important parts of the process. This paper presents a study on the impact of readability features for detecting fake news in Brazilian Portuguese. The results show that such features are relevant to the task (achieving, alone, up to 92% classification accuracy) and may improve previous classification results.
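As an illustration of this kind of pipeline, the sketch below computes a few simple readability-style features (average sentence length, average word length, and type-token ratio) and feeds them to a scikit-learn logistic regression. The feature set, function names, and toy data are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch: readability-style features + a linear classifier.
# The features and data here are illustrative, not the paper's.
import re
from sklearn.linear_model import LogisticRegression

def readability_features(text):
    """Return simple readability proxies for one document."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    avg_sent_len = len(words) / max(len(sentences), 1)   # words per sentence
    avg_word_len = sum(map(len, words)) / max(len(words), 1)
    type_token = len(set(words)) / max(len(words), 1)    # lexical diversity
    return [avg_sent_len, avg_word_len, type_token]

# Toy corpus: 1 = fake, 0 = true (placeholder labels).
docs = ["Shocking!!! You will not believe this...",
        "The ministry published the annual budget report on Tuesday."]
labels = [1, 0]

X = [readability_features(d) for d in docs]
clf = LogisticRegression().fit(X, labels)
print(clf.predict([readability_features("Unbelievable miracle cure found!")]))
```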
This research proposes a new way to improve the search outcome of Arabic semantic search by abstractively summarizing Arabic texts (abstractive summarization) using Natural Language Processing (NLP) algorithms, Word Sense Disambiguation (WSD), and techniques for measuring semantic similarity in the Arabic WordNet ontology.
Keywords: Natural Language Processing (NLP), Information Retrieval (IR), Abstractive Summarization, Arabic WordNet (AWN), Conceptual Semantic Relation, Semantic Similarity, Semantic Analysis, Word Sense Disambiguation (WSD)
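For a concrete picture of the WSD and similarity components, the sketch below uses NLTK's Lesk implementation and WordNet's Wu-Palmer similarity. It runs on English synsets for readability; with the Open Multilingual WordNet data installed, the same synset machinery can be queried with lang='arb' for Arabic. This is an illustration of the technique family, not necessarily the paper's own method.

```python
# Sketch: dictionary-based WSD (Lesk) plus taxonomy-based similarity.
# Arabic WordNet is exposed through the same NLTK API via lang='arb'
# once the Open Multilingual WordNet data is downloaded.
import nltk
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

# Disambiguate "bank" in context with the Lesk algorithm.
context = "I deposited the money at the bank".split()
sense = lesk(context, "bank")          # returns a Synset
print(sense, "-", sense.definition())

# Wu-Palmer similarity between two concepts in the WordNet taxonomy.
dog, cat = wn.synset("dog.n.01"), wn.synset("cat.n.01")
print(dog.wup_similarity(cat))         # value in (0, 1], higher = closer

# Arabic lemmas of a synset (requires the OMW Arabic data).
print(wn.synset("dog.n.01").lemma_names("arb"))
```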
This paper presents a reference study of available algorithms for plagiarism detection and develops a semantic plagiarism detection algorithm for medical research papers, employing the medical ontologies available on the World Wide Web. Plagiarism detection in medical research written in natural language is a complex problem that is closely tied to the specific domain of the research.
Many algorithms are used for plagiarism detection in natural language. They generally fall into two main categories: the first is file comparison using file fingerprints, and the second is file content comparison, which includes string matching algorithms and text and tree matching algorithms; the fingerprint idea is sketched below.
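As an illustration of the fingerprint category, the sketch below hashes overlapping k-grams of each document and keeps a deterministic subset (hashes divisible by a fixed p, a common "0 mod p" selection scheme); overlap between the retained fingerprint sets suggests shared passages. The parameters k and p are arbitrary choices, not values from the paper.

```python
# Sketch of k-gram fingerprinting for document comparison.
# Keeping only hashes == 0 (mod p) gives a small, deterministic signature.
import hashlib

def fingerprints(text, k=5, p=4):
    words = text.lower().split()
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    hashes = (int(hashlib.md5(g.encode()).hexdigest(), 16) for g in grams)
    return {h for h in hashes if h % p == 0}

def overlap(a, b):
    """Jaccard similarity of two fingerprint sets."""
    fa, fb = fingerprints(a), fingerprints(b)
    return len(fa & fb) / max(len(fa | fb), 1)

doc1 = "plagiarism detection in medical research is a complex problem indeed"
doc2 = "detection in medical research is a complex problem for many systems"
print(round(overlap(doc1, doc2), 3))   # fraction of shared retained fingerprints
```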
Recently, much research has addressed semantic plagiarism detection, and semantic plagiarism detection algorithms have been developed based on citation analysis models in scientific research.
In this research, a system for plagiarism detection was developed using the Bing search engine. Two types of ontologies are used in the system: a public ontology, such as WordNet, and several standard international ontologies in the medical domain, such as the Disease Ontology, which contains descriptions and definitions of diseases and the derivation relations between them.
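One possible way to consume such a medical ontology programmatically is sketched below, using the obonet library to load the Disease Ontology's OBO release and expose term names and is-a distances. The tooling choice and the proximity measure are assumptions for illustration, not the system's actual implementation.

```python
# Sketch: loading the Disease Ontology and using its is-a hierarchy.
# obonet parses OBO files into a networkx graph (assumed tooling).
import networkx as nx
import obonet

url = "http://purl.obolibrary.org/obo/doid.obo"
graph = obonet.read_obo(url)                    # nodes are DOID terms

# Map term names to IDs so surface forms can be normalized.
name_to_id = {data["name"]: id_ for id_, data in graph.nodes(data=True)
              if "name" in data}

# Terms that share a nearby ancestor are semantically related; edge
# distance in the is-a DAG is a crude proximity measure.
a = name_to_id.get("influenza")
b = name_to_id.get("COVID-19")
if a and b:
    print(nx.shortest_path_length(graph.to_undirected(), a, b))
```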
This paper presents a review of available algorithms and plagiarism detection systems, and an implementation of a plagiarism detection system using available web search engines.
Plagiarism detection in natural language documents is a complicated problem, and it is related to the characteristics of the language itself.
There are many available algorithms for plagiarism detection in natural languages. Generally, these algorithms belong to two main categories: the first is plagiarism detection algorithms based on fingerprints, and the second is plagiarism detection algorithms based on content comparison, which includes string matching and tree matching algorithms.
Available plagiarism detection systems usually use one specific type of detection algorithm, or mix detection algorithms to obtain a detection system that is both fast and accurate.
In this research, a plagiarism detection system has been developed using the Bing search engine and a plagiarism detection algorithm based on Rhetorical Structure Theory.
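The retrieval-plus-comparison structure of such a system can be sketched as below: candidate source documents (assumed already fetched, since search engine API details vary and are not reproduced here) are scored by how much of the suspicious text's n-grams they contain. In the actual system, the RST-based comparison would replace this simple containment measure.

```python
# Sketch: score retrieved candidate sources against a suspicious text
# by n-gram containment (a stand-in for the RST-based comparison).
def ngrams(text, n=3):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def containment(suspicious, candidate, n=3):
    """Fraction of the suspicious text's n-grams found in the candidate."""
    s, c = ngrams(suspicious, n), ngrams(candidate, n)
    return len(s & c) / max(len(s), 1)

suspicious = "rhetorical structure theory describes how text spans relate"
candidates = {                       # pretend these came from web search
    "doc1": "rhetorical structure theory describes how text spans relate to each other",
    "doc2": "an unrelated document about medical ontologies and diseases",
}
for name, text in candidates.items():
    print(name, round(containment(suspicious, text), 2))
```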
The lexicon plays an essential role in natural language processing systems, and especially in machine translation systems, because it provides the system's components with the information necessary for the translation process. Although there has been a good deal of research in the natural language processing field, not enough attention has been given to the importance of the lexicon, and especially the Arabic lexicon.
Proofreading is the process of checking text to detect spelling, grammatical, and semantic errors in order to correct them. Proofreading the grammar and meaning of natural language text is one of the basic objectives of computational linguistics, because checking text written on computers has become necessary in many areas, such as proofreading emails and text on web pages; it is also essential for proofreading scientific articles, and it can be used to correct students' answers in traditional e-learning exams. Moreover, manually correcting students' answers is expensive in terms of time and effort, is sometimes error prone, and becomes more difficult with large numbers of students, so automatic correction saves time and effort and avoids the errors of traditional marking. This research presents the stages of building an automatic content verification compiler: a system for checking English syntax. It describes lexical analysis, which is the first stage before syntax analysis, and then the syntax analysis itself, which builds a grammatical model describing simple English sentences. The study is based on the grammatical structure of English, proposes suitable parts of this model, and presents an application that verifies English sentences and draws their derivation trees.
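A minimal version of this lexical-analysis-then-parsing pipeline can be sketched with NLTK's chart parser over a toy context-free grammar for simple English sentences; the grammar below is an illustration, not the paper's actual grammatical model.

```python
# Sketch: tokenize a simple English sentence and build its derivation
# tree with a toy CFG (stand-in for the paper's grammatical model).
import nltk

grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the' | 'a'
    N  -> 'student' | 'answer'
    V  -> 'writes' | 'checks'
""")

parser = nltk.ChartParser(grammar)
tokens = "the student writes the answer".split()  # lexical analysis step
for tree in parser.parse(tokens):
    tree.pretty_print()        # ASCII derivation tree
```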
This paper presents ArOntoLearn, a framework for Arabic ontology learning from textual resources. Supporting the Arabic language and using domain knowledge in the learning process are the main features of our framework. Besides, it represents the learned ontology in a Probabilistic Ontology Model (POM), which can be translated into any knowledge representation formalism, and it implements data-driven change discovery: it updates the POM according to corpus changes only, allowing the user to trace the evolution of the ontology with respect to changes in the underlying corpus. Our framework analyses Arabic textual resources and matches them against Arabic lexico-syntactic patterns in order to learn new concepts and relations.
Supporting Arabic is not an easy task, because current linguistic analysis tools are not efficient enough to process unvocalized Arabic corpora, which rarely contain appropriate punctuation. We therefore built a flexible, freely configurable framework in which any linguistic analysis tool can be replaced by a more sophisticated one whenever it becomes available.
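Lexico-syntactic pattern matching of this kind can be illustrated with a Hearst-style pattern; the regular expression below is an English stand-in (the framework's actual Arabic patterns are not reproduced here) that extracts hypernym-hyponym candidate pairs.

```python
# Sketch: Hearst-style lexico-syntactic pattern for "X such as Y, Z".
# English stand-in for the framework's Arabic patterns.
import re

PATTERN = re.compile(
    r"(?P<hyper>\w+)\s+such\s+as\s+(?P<hypos>\w+(?:\s*,\s*\w+)*(?:\s+and\s+\w+)?)"
)

def extract_pairs(text):
    """Yield (hyponym, hypernym) candidate pairs from raw text."""
    for m in PATTERN.finditer(text):
        hyper = m.group("hyper")
        for hypo in re.split(r"\s*,\s*|\s+and\s+", m.group("hypos")):
            yield hypo, hyper

text = "diseases such as influenza, measles and malaria spread quickly"
print(list(extract_pairs(text)))
# [('influenza', 'diseases'), ('measles', 'diseases'), ('malaria', 'diseases')]
```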
Morphological analysis is an important step in natural language processing and its various applications. Each kind of application needs a certain balance between performance, accuracy, and generality of the solutions (i.e., getting all possible roots): while we focus on performance with good accuracy in information retrieval applications, we try to achieve high accuracy in systems like POS taggers and machine translation, and both high accuracy and high generality in systems like language learning systems and Arabic lexical dictionaries. In this paper, we describe our approach to building a flexible, application-oriented Arabic morphological analyzer; the approach is designed to satisfy the varying requirements of most applications that need morphological processing. It also provides a separate stage (the Original Letters Detection algorithm) which can easily be plugged into any other morphological analyzer to improve its performance, with no negative effect on its reliability.
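To make the generality requirement concrete, the sketch below enumerates candidate stems for an Arabic word by stripping common affixes. It is a deliberately naive illustration, not the paper's Original Letters Detection algorithm: real analyzers also handle templatic patterns, assimilation, and vocalization.

```python
# Naive sketch: enumerate candidate tri-literal stems by stripping
# common Arabic prefixes/suffixes. Illustrative only; templatic infixes
# (e.g. the alif of the fa3il pattern) are out of scope here.
PREFIXES = ["ال", "و", "ف", "ب", "ك", "ل", "م", "ت", "ي", "ن"]
SUFFIXES = ["ون", "ات", "ان", "ين", "ها", "ة", "ه", "ي", "ا", "ت"]

def candidate_stems(word):
    """Return all 3-letter stems reachable by affix stripping (favouring
    generality: every plausible candidate is kept for later filtering)."""
    stems, seen, queue = set(), set(), [word]
    while queue:
        w = queue.pop()
        if w in seen:
            continue
        seen.add(w)
        if len(w) == 3:
            stems.add(w)
            continue
        for p in PREFIXES:
            if w.startswith(p) and len(w) - len(p) >= 3:
                queue.append(w[len(p):])
        for s in SUFFIXES:
            if w.endswith(s) and len(w) - len(s) >= 3:
                queue.append(w[:-len(s)])
    return stems

print(candidate_stems("والكاتبون"))   # strips و, ال prefixes and the ون suffix
```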
Statistical approaches to processing natural language text have become dominant in recent years. This work provides broad but rigorous coverage of the mathematical and linguistic foundations, as well as a detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.