We aimed to distinguish information extraction systems from other research areas such as information retrieval and data mining. We tried to determine the general structure of such systems, which form part of larger systems whose mission is to answer user queries based on the extracted information.
We reviewed the different types of these systems and the techniques used with them, and tried to define the current and future challenges and the consequent research problems.
Finally, we tried to discuss the details of the various implementations of these systems by explaining two platforms, GATE and OpenCalais, comparing their information extraction systems, and discussing the results.
In this paper, we introduce an algorithm for grouping Arabic documents in order to build an ontology and its words. We execute the algorithm on five ontologies using Java. We process the documents and obtain 338667 words, together with their weights corresponding to each ontology. The algorithm proved its efficiency in improving the performance of the classifiers (SVM, NB) tested in this study, compared with the results of earlier classifiers for the Arabic language.
The Semantic Web is a new revolution in the world of the Web, in which information and
data become amenable to logical processing by computer programs and are transformed
into a meaningful network of data. Although the Semantic Web is considered the future
of the World Wide Web, Arabic research and studies in this field are still relatively
rare. Therefore, this paper gives a reference study of the Semantic Web and of the
different methods used to explore knowledge and discover useful information from the
vast amount of data provided by the Web. It gives a programming example that applies
some of the techniques provided by the Semantic Web and its methods of knowledge
discovery. This simplified programming example provides services related to Syrian
government higher education, such as information about the Syrian public universities:
the name of the university (Syrian Virtual University, Tishreen, Aleppo, Damascus, and
Al Baath), the address of the university, its web site, the number of students, and a
summary of the university. This information helps intelligent agents to find those
services dynamically.
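As a rough illustration of how university information of this kind could be exposed to software agents, the following minimal sketch stores it as subject-predicate-object triples and answers simple pattern queries. It is not the paper's implementation: the namespace `http://example.org/university#`, the property names `type` and `name`, and the helper functions are assumptions made for this example. Only the university names come from the text.

```python
# Hypothetical namespace and property names, used for illustration only.
EX = "http://example.org/university#"

# University names as listed in the abstract.
UNIVERSITIES = [
    "Syrian Virtual University",
    "Tishreen",
    "Aleppo",
    "Damascus",
    "Al Baath",
]


def build_triples(names):
    """Return (subject, predicate, object) triples describing each university."""
    triples = []
    for name in names:
        subject = EX + name.replace(" ", "_")
        triples.append((subject, EX + "type", EX + "PublicUniversity"))
        triples.append((subject, EX + "name", name))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


triples = build_triples(UNIVERSITIES)

# An agent could ask for the names of all described universities.
names = [o for _, _, o in match(triples, p=EX + "name")]
print(names)
```

Further properties mentioned in the abstract, such as the address, web site, or number of students, would be added as additional triples on the same subject in the same way.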
This paper presents a reference study of the available algorithms for plagiarism
detection and develops a semantic plagiarism detection algorithm for medical research
papers that employs the medical ontologies available on the World Wide Web.
Plagiarism detection in medical research written in natural language is a complex
problem that is closely tied to the specific domain of medical research.
Many algorithms are used for plagiarism detection in natural language. They are
generally divided into two main categories: algorithms that compare files using file
fingerprints, and algorithms that compare file contents, which include string matching
algorithms and text and tree matching algorithms.
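The fingerprint category can be illustrated with a minimal sketch, assuming one common variant: each document is reduced to the set of hashes of its overlapping k-word sequences, and two documents are compared by the Jaccard overlap of those sets. The function names, the choice of SHA-1, and the value of `k` are assumptions for this example, not details taken from the paper.

```python
import hashlib


def fingerprints(text, k=3):
    """Return the set of hashes of all overlapping k-word sequences in text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        # Short texts yield a single fingerprint over all their words.
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}


def similarity(a, b, k=3):
    """Jaccard similarity of the two fingerprint sets, between 0.0 and 1.0."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


original = "the patient was treated with a low dose of aspirin daily"
suspect = "the patient was treated with a high dose of aspirin daily"
print(similarity(original, suspect))
```

A comparison of this kind only detects shared wording. It does not capture paraphrase, which is the gap that semantic approaches aim to address.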
Recently, much research has been carried out on semantic plagiarism detection
algorithms, and semantic plagiarism detection algorithms based on citation analysis
models in scientific research have been developed.
In this research, a plagiarism detection system was developed using the “Bing” search
engine. Two types of ontologies are used in this system: a public ontology such as
WordNet, and several standard international ontologies in the medical domain, such as
the Disease Ontology, which contains descriptions and definitions of diseases and the
derivation relations between diseases.
In the past few years, a new web has appeared alongside the traditional web. It is
called the Web of Linked Data. It has been developed to present data in a
machine-readable form. The main idea is to describe data using a set of terms called a
web ontology.
At this time, tools and standards related to the semantic web are becoming comprehensive
and stable; however, publishing university data as linked data still faces some major
challenges. First of all, there is no unified, well-accepted vocabulary for describing
university-related information.
This article aims to find an ontology that can be used to describe data in the
university domain, making it possible to integrate this data with data from other
universities and to query it. The web ontology was built by reusing vocabularies
available on the web and adding new classes and properties. The ontology was
organized using Protégé.