Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable for training NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase of 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.
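
For readers unfamiliar with the perplexity figure quoted above, the following is a minimal sketch of how perplexity is conventionally computed from the probabilities a model assigns to the reference tokens. The function name and the example probabilities are hypothetical and are not taken from the paper's evaluation code.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical per-token probabilities, for illustration only.
example = perplexity([0.25, 0.5, 0.125, 0.25])

# A model that always spreads its probability uniformly over 17 tokens
# has a perplexity of 17, which gives an intuition for the reported value.
uniform = perplexity([1 / 17] * 5)
```

Under this reading, a perplexity of 17.0 means the model is on average about as uncertain as a uniform choice among 17 tokens at each step.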
This paper presents ArOntoLearn, a framework for learning Arabic ontologies from textual resources.
Its main features are support for the Arabic language and the use of domain knowledge in the learning
process. In addition, it represents the learned ontology in a Probabilistic Ontology Model (POM), which
can be translated into any knowledge representation formalism, and implements data-driven change
discovery. It therefore updates the POM only in response to changes in the corpus, and allows the user to trace
the evolution of the ontology with respect to the changes in the underlying corpus. Our framework
analyses Arabic textual resources and matches them against Arabic lexico-syntactic patterns in order to learn
new concepts and relations.
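
To illustrate what matching text against a lexico-syntactic pattern can look like, here is a minimal sketch using a single hand-written pattern, "X مثل Y" ("X such as Y"), read as "Y is a kind of X". The pattern, the function name, and the example sentence are assumptions made for illustration. They are not ArOntoLearn's pattern set, which works on linguistically analysed text rather than raw strings.

```python
import re

# Hypothetical pattern: "<hypernym> مثل <hyponym>" ("X such as Y").
# In Python 3, \w is Unicode-aware and matches unvocalized Arabic letters.
SUCH_AS = re.compile(r"(\w+)\s+مثل\s+(\w+)")

def extract_is_a(sentence):
    """Return (hyponym, hypernym) pairs suggested by the 'such as' pattern."""
    return [(m.group(2), m.group(1)) for m in SUCH_AS.finditer(sentence)]

# "Fruits such as apples": suggests that "apples" is a kind of "fruits".
pairs = extract_is_a("الفواكه مثل التفاح")
```

A regex over surface strings is only a stand-in here. It ignores morphology and multi-word terms, which is one reason a real system would match patterns over the output of linguistic analysis tools.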
Supporting the Arabic language is not an easy task, because current linguistic analysis tools are not efficient
enough to process unvocalized Arabic corpora that rarely contain appropriate punctuation. We therefore tried
to build a flexible, freely configurable framework in which any linguistic analysis tool can be replaced by
a more sophisticated one whenever it becomes available.
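
The idea of swapping one analysis tool for another can be sketched as follows, assuming the pipeline depends only on a small callable interface. All names here (`Tokenizer`, `Pipeline`, and the two tokenizers) are hypothetical and do not describe ArOntoLearn's actual architecture.

```python
import re
from typing import Callable, List

# Any callable from text to tokens can serve as the tokenizer component.
Tokenizer = Callable[[str], List[str]]

def whitespace_tokenizer(text: str) -> List[str]:
    """Baseline tool: split on whitespace only."""
    return text.split()

def punctuation_aware_tokenizer(text: str) -> List[str]:
    """Replacement tool: keep runs of word characters, drop punctuation."""
    return re.findall(r"\w+", text)

class Pipeline:
    """Depends only on the Tokenizer interface, not on a concrete tool."""

    def __init__(self, tokenizer: Tokenizer = whitespace_tokenizer):
        self.tokenizer = tokenizer

    def analyse(self, text: str) -> List[str]:
        return self.tokenizer(text)

# Swapping in the better tool requires no change to Pipeline itself.
baseline = Pipeline().analyse("كتاب، قلم")
improved = Pipeline(punctuation_aware_tokenizer).analyse("كتاب، قلم")
```

In this toy case the baseline leaves the Arabic comma attached to the first token, while the replacement removes it.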
Morphological analysis is an important step in natural language processing and its
various applications. Each kind of application needs a certain balance between
performance, accuracy, and generality of solutions (i.e., getting all possible roots):
we focus on performance with good accuracy in information retrieval applications,
we aim for high accuracy in systems such as POS taggers and machine translation, and
for both high accuracy and high generality in systems such as language learning systems and
Arabic lexical dictionaries. In this paper, we describe our approach to building a flexible,
application-oriented Arabic morphological analyzer; this approach is designed to
satisfy the various requirements of most applications that need morphological processing.
It also provides a separate stage (the Original Letters Detection Algorithm) that can be
plugged easily into any other morphological analyzer to improve its performance
with no negative effect on its reliability.
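
The trade-off between performance and generality described above can be sketched with a toy prefix stripper that either stops at the first candidate or returns every candidate. This is not the Original Letters Detection Algorithm, whose details are not given here. The prefix list, the function name, and the length threshold are assumptions for illustration only.

```python
# Toy prefix list (assumption), ordered longest first.
# A real analyzer uses far richer affix and pattern rules.
PREFIXES = ["وال", "ال", "و"]

def candidate_stems(word, all_candidates=False):
    """Return candidate stems obtained by stripping a known prefix.

    all_candidates=False stops at the first match (favours speed);
    all_candidates=True returns every candidate (favours generality).
    """
    candidates = []
    for prefix in PREFIXES:
        # Require at least two letters to remain after stripping.
        if word.startswith(prefix) and len(word) > len(prefix) + 1:
            candidates.append(word[len(prefix):])
            if not all_candidates:
                return candidates
    candidates.append(word)  # the unstripped word is the final candidate
    return candidates

fast = candidate_stems("والكتاب")                          # "and the book"
general = candidate_stems("والكتاب", all_candidates=True)
```

An information retrieval system might prefer the first mode, while a dictionary or language learning tool might prefer the second so that it can present all plausible analyses.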