
Towards a system for constructing Arabic Ontology based on natural text

Building the nucleus of a system to assist in constructing an Arabic ontology from texts

Publication date: 2011
Research language: Arabic





This paper presents ArOntoLearn, a framework for Arabic ontology learning from textual resources. Support for the Arabic language and the use of domain knowledge in the learning process are the main features of our framework. In addition, it represents the learned ontology in a Probabilistic Ontology Model (POM), which can be translated into any knowledge representation formalism, and it implements data-driven change discovery: the POM is updated only in response to corpus changes, allowing the user to trace the evolution of the ontology with respect to changes in the underlying corpus. Our framework analyses Arabic textual resources and matches them against Arabic lexico-syntactic patterns in order to learn new concepts and relations. Supporting Arabic is not an easy task, because current linguistic analysis tools are not efficient enough to process unvocalized Arabic corpora, which rarely contain appropriate punctuation. We therefore designed a flexible, freely configurable framework in which any linguistic analysis tool can be replaced by a more sophisticated one whenever it becomes available.
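The abstract names two concrete mechanisms: matching text against Arabic lexico-syntactic patterns to harvest concepts and relations, and keeping a POM that is updated only when the corpus changes. The Python sketch below illustrates both ideas under stated assumptions; the two patterns and the Counter-based POM are illustrative stand-ins, not the paper's actual pattern set or model.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: Hearst-style lexico-syntactic patterns
# adapted to Arabic, feeding a toy "POM" kept as triple counts.
import re
from collections import Counter

# "X مثل Y" reads roughly as "X such as Y"; "X هو نوع من Y" as
# "X is a kind of Y". Both suggest an is-a relation. (Assumed patterns.)
PATTERNS = [
    (re.compile(r"(\w+)\s+مثل\s+(\w+)"), lambda m: (m.group(2), "is-a", m.group(1))),
    (re.compile(r"(\w+)\s+هو\s+نوع\s+من\s+(\w+)"), lambda m: (m.group(1), "is-a", m.group(2))),
]

def extract_triples(sentence):
    """Match one sentence against every pattern, yielding
    (concept, relation, concept) candidate triples."""
    for regex, build in PATTERNS:
        for m in regex.finditer(sentence):
            yield build(m)

def update_pom(pom, new_sentences):
    """Data-driven change discovery, crudely: only new or changed
    sentences are analysed, and the POM counts accumulate evidence."""
    for sentence in new_sentences:
        pom.update(extract_triples(sentence))
    return pom

pom = update_pom(Counter(), ["الحيوانات مثل القطط"])  # "animals such as cats"
print(pom.most_common())  # [(('القطط', 'is-a', 'الحيوانات'), 1)]
```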



References used
Johnson, C., Fillmore, C., Petruck, M., Baker, C., Ellsworth, M., Ruppenhofer, J., and Wood, E. 2002. FrameNet: Theory and Practice. http://www.icsi.berkeley.edu/framenet
Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., Johnson, C. R., and Scheffczyk, J. 2006. FrameNet II: Extended Theory and Practice.
WordNet. Retrieved June 2009, from http://www.globalwordnet.org
Related research

Biomaterials are synthetic or natural materials used for constructing artificial organs, fabricating prostheses, or replacing tissues. The last century saw the development of thousands of novel biomaterials and, as a result, an exponential increase in scientific publications in the field. Large-scale analysis of biomaterials and their performance could enable data-driven material selection and implant design. However, such analysis requires identification and organization of concepts, such as materials and structures, from published texts. To facilitate future information extraction and the application of machine-learning techniques, we developed a semantic annotator specifically tailored for the biomaterials literature. The Biomaterials Annotator has been implemented following a modular organization using software containers for the different components and orchestrated using Nextflow as workflow manager. Natural language processing (NLP) components are mainly developed in Java. This set-up has allowed named entity recognition of seventeen classes relevant to the biomaterials domain. Here we detail the development, evaluation and performance of the system, as well as the release of the first collection of annotated biomaterials abstracts. We make both the corpus and system available to the community to promote future efforts in the field and contribute towards its sustainability.
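As a rough, hypothetical illustration of the named-entity recognition step described above (the annotator itself is a set of containerized Java components orchestrated with Nextflow), here is a minimal dictionary-based tagger in Python. The two classes and their term lists are invented for demonstration and are not the system's actual seventeen classes.

```python
# Illustrative sketch only: gazetteer-based NER for two assumed
# biomaterials classes.
import re

GAZETTEER = {
    "MATERIAL": ["titanium", "hydroxyapatite", "collagen"],
    "STRUCTURE": ["scaffold", "nanofiber", "membrane"],
}

def annotate(text):
    """Return sorted (start, end, class, surface) spans found in text."""
    spans = []
    for label, terms in GAZETTEER.items():
        for term in terms:
            for m in re.finditer(r"\b%s\b" % re.escape(term), text, re.IGNORECASE):
                spans.append((m.start(), m.end(), label, m.group()))
    return sorted(spans)

print(annotate("A titanium scaffold coated with collagen."))
# [(2, 10, 'MATERIAL', 'titanium'), (11, 19, 'STRUCTURE', 'scaffold'),
#  (32, 40, 'MATERIAL', 'collagen')]
```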
Recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answer the given question. Typically, these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained. We introduce a new dataset called BiQuAD that requires deeper comprehension in order to answer questions in both extractive and deductive fashion. The dataset consists of 4,190 closed-domain texts and a total of 99,149 question-answer pairs. The texts are synthetically generated soccer match reports that verbalize the main events of each match. All texts are accompanied by a structured Datalog program that represents a (logical) model of its information. We show that state-of-the-art QA models do not perform well on the challenging long-form contexts and reasoning requirements posed by the dataset. In particular, transformer-based state-of-the-art models achieve F1-scores of only 39.0. We demonstrate how these synthetic datasets align structured knowledge with natural text and aid model introspection when approaching complex text understanding.
Psychometric measures of ability, attitudes, perceptions, and beliefs are crucial for understanding user behavior in various contexts including health, security, e-commerce, and finance. Traditionally, psychometric dimensions have been measured and collected using survey-based methods. Inferring such constructs from user-generated text could allow timely, unobtrusive collection and analysis. In this paper we describe our efforts to construct a corpus for psychometric natural language processing (NLP) related to important dimensions such as trust, anxiety, numeracy, and literacy, in the health domain. We discuss our multi-step process to align user text with their survey-based response items and provide an overview of the resulting testbed, which encompasses survey-based psychometric measures and accompanying user-generated text from 8,502 respondents. Our testbed also encompasses self-reported demographic information, including race, sex, age, income, and education, thereby affording opportunities for measuring bias and benchmarking fairness of text classification methods. We report preliminary results on use of the text to predict/categorize users' survey response labels, and on the fairness of these models. We also discuss the important implications of our work and resulting testbed for future NLP research on psychometrics and fairness.
Most of the time, when dealing with a particular Natural Language Processing task, systems are compared on the basis of global statistics such as recall, precision, F1-score, etc. While such scores provide a general idea of the behavior of these systems, they ignore a key piece of information that can be useful for assessing progress and discerning remaining challenges: the relative difficulty of test instances. To address this shortcoming, we introduce the notion of differential evaluation which effectively defines a pragmatic partition of instances into gradually more difficult bins by leveraging the predictions made by a set of systems. Comparing systems along these difficulty bins enables us to produce a finer-grained analysis of their relative merits, which we illustrate on two use-cases: a comparison of systems participating in a multi-label text classification task (CLEF eHealth 2018 ICD-10 coding), and a comparison of neural models trained for biomedical entity detection (BioCreative V chemical-disease relations dataset).
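The binning itself is straightforward to reproduce. A minimal sketch, assuming correctness is already known per system and per instance: each test instance goes into the bin indexed by the number of systems that solved it, so bin 0 collects the hardest instances and the last bin the easiest. This is one direct reading of the partition described above, not necessarily the paper's exact procedure.

```python
# Sketch: partition instances into difficulty bins from a boolean
# correctness matrix (one row per system, one column per instance).
from collections import defaultdict

def difficulty_bins(correct):
    """correct[s][i] is True iff system s solved instance i.
    Returns {number_of_systems_correct: [instance indices]}."""
    bins = defaultdict(list)
    for i in range(len(correct[0])):
        solved = sum(row[i] for row in correct)
        bins[solved].append(i)
    return dict(bins)

# Three systems, four instances: instance 0 is solved by all systems,
# instance 3 by none (the hardest bin).
matrix = [
    [True, True,  False, False],
    [True, False, True,  False],
    [True, True,  False, False],
]
print(difficulty_bins(matrix))  # {3: [0], 2: [1], 1: [2], 0: [3]}
```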
The ability to search Web sites has become essential for many people. However, many sites have problems giving users the information they need. Search operations are typically limited to keyword searches and do not take into consideration the underlying semantics of the content. Present technologies support most languages, though Arabic is still not well supported. One of the main application areas of ontology technology is semantics. Although there are many tools for developing ontologies in many languages, Arabic WordNet seems to be the only one that supports the Arabic language. In this paper we define the necessary steps to develop an Arabic ontology for university sites using Arabic WordNet, and check that the developed ontology is clean.
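One way such a step could look in practice, shown purely as a sketch: NLTK's Open Multilingual Wordnet exposes Arabic WordNet under the language code 'arb', so candidate is-a relations for an Arabic term can be read off its hypernyms. The seed word below is an arbitrary example ("university"), and this is not the procedure the paper actually defines.

```python
# Sketch: seed is-a edges for an Arabic ontology from Arabic WordNet.
# Requires: nltk.download('wordnet'); nltk.download('omw-1.4')
from nltk.corpus import wordnet as wn

def hypernym_edges(term, lang="arb"):
    """Yield (term, 'is-a', hypernym lemma) edges for an Arabic word."""
    for synset in wn.synsets(term, lang=lang):
        for hyper in synset.hypernyms():
            for lemma in hyper.lemma_names(lang=lang):
                yield (term, "is-a", lemma)

for edge in hypernym_edges("جامعة"):  # "university"
    print(edge)
```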
