In this paper we present the methodology adopted in building the ArOntoLearn platform, a framework that helps build an Arabic ontology from texts on the Web. The most important features of this framework are that it supports the Arabic language and uses prior knowledge in the learning procedures; in addition, it represents the resulting ontology using the Probabilistic Ontology Model (POM), which can be translated into any knowledge representation formalism. The system analyses Arabic textual resources and matches them against lexico-syntactic patterns in order to learn new concepts and relations. Supporting Arabic is not easy, because the available linguistic processing tools are not effective enough to process unvocalized Arabic texts, which also rarely contain the correct punctuation that helps in parsing sentences properly. We therefore tried to build a flexible framework that can be configured easily, so that the analysis tools used in it can be modified and replaced by more advanced ones when they become available.
This paper presents ArOntoLearn, a framework for learning Arabic ontologies from textual resources.
Its main features are support for the Arabic language and the use of domain knowledge in the learning
process. It also represents the learned ontology in a Probabilistic Ontology Model (POM), which
can be translated into any knowledge representation formalism, and it implements data-driven change
discovery: it updates the POM only according to changes in the corpus, and it allows the user to trace
the evolution of the ontology with respect to the changes in the underlying corpus. Our framework
analyses Arabic textual resources and matches them to Arabic lexico-syntactic patterns in order to learn
new concepts and relations.
Supporting the Arabic language is not an easy task, because current linguistic analysis tools are not efficient
enough to process unvocalized Arabic corpora, which rarely contain appropriate punctuation. We therefore tried
to build a flexible and freely configurable framework in which any linguistic analysis tool can be replaced by
a more sophisticated one whenever it becomes available.
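The pattern-matching step described above can be illustrated with a minimal sketch. It is not the ArOntoLearn implementation: the single pattern ("X مثل Y", roughly "X such as Y"), its weight of 0.7, the relation name `subclass_of`, the function `learn`, and the noisy-or rule for combining evidence are all assumptions chosen for illustration. The sketch only shows how matches of a lexico-syntactic pattern could be turned into relation candidates that carry a probability, in the spirit of a POM.

```python
import re
from collections import defaultdict

# Hypothetical lexico-syntactic pattern: "<hypernym> مثل <hyponym>"
# ("<hypernym> such as <hyponym>"). Pattern, relation name and weight
# are illustrative assumptions, not taken from ArOntoLearn.
PATTERNS = [
    (re.compile(r"(\w+) مثل (\w+)"), "subclass_of", 0.7),
]


def learn(sentences):
    """Return a toy POM: {(relation, child, parent): probability}."""
    evidence = defaultdict(list)
    for sentence in sentences:
        for regex, relation, weight in PATTERNS:
            # Each match yields one (parent, child) pair of surface words.
            for parent, child in regex.findall(sentence):
                evidence[(relation, child, parent)].append(weight)

    pom = {}
    for key, weights in evidence.items():
        # Assumed combination rule (noisy-or): every additional match
        # of the same relation raises its confidence.
        miss = 1.0
        for w in weights:
            miss *= 1.0 - w
        pom[key] = 1.0 - miss
    return pom


corpus = [
    "الفواكه مثل التفاح",
    "نأكل الفواكه مثل التفاح",
    "الحيوانات مثل القطط",
]
pom = learn(corpus)
for (relation, child, parent), prob in pom.items():
    print(relation, child, parent, round(prob, 2))
```

Because each candidate is stored with a probability instead of being asserted outright, a relation seen in two sentences ends up with higher confidence than one seen once. A real system would match over the output of a morphological analyser, not over raw surface words as this sketch does.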