Analysis of the Arabic Broken Plural and Diminutive

43 0 0.0 ( 0 )

Download Cite

Added by George Kiraz

Publication date 1995

fields Informatics Engineering

and research's language is English

Authors George A. Kiraz

Computation and Language

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper demonstrates how the challenging problem of the Arabic broken plural and diminutive can be handled under a multi-tape two-level model, an extension to two-level morphology.

rate research

Computational Analyses of Arabic Morphology

52 - George A. Kiraz 1994

This paper demonstrates how a (multi-tape) two-level formalism can be used to write two-level grammars for Arabic non-linear morphology using a high level, but computationally tractable, notation. Three illustrative grammars are provided based on CV-, moraic- and affixational analyses. These are complemented by a proposal for handling the hitherto computationally untreated problem of the broken plural. It will be shown that the best grammars for describing Arabic non-linear morphology are moraic in the case of templatic stems, and affixational in the case of a-templatic stems. The paper will demonstrate how the broken plural can be derived under two-level theory via the `implicit derivation of the singular.

Computation and Language

An Enhanced Corpus for Arabic Newspapers Comments

421 - Hichem Rahab , Abdelhafid Zitouni , Mahieddine Djoudi 2021

In this paper, we propose our enhanced approach to create a dedicated corpus for Algerian Arabic newspapers comments. The developed approach has to enhance an existing approach by the enrichment of the available corpus and the inclusion of the annotation step by following the Model Annotate Train Test Evaluate Revise (MATTER) approach. A corpus is created by collecting comments from web sites of three well know Algerian newspapers. Three classifiers, support vector machines, na{i}ve Bayes, and k-nearest neighbors, were used for classification of comments into positive and negative classes. To identify the influence of the stemming in the obtained results, the classification was tested with and without stemming. Obtained results show that stemming does not enhance considerably the classification due to the nature of Algerian comments tied to Algerian Arabic Dialect. The promising results constitute a motivation for us to improve our approach especially in dealing with non Arabic sentences, especially Dialectal and French ones.

Information Retrieval Computation and Language Multiagent Systems

Free the Plural: Unrestricted Split-Antecedent Anaphora Resolution

188 - Juntao Yu , Nafise Sadat Moosavi , Silviu Paun 2020

Now that the performance of coreference resolvers on the simpler forms of anaphoric reference has greatly improved, more attention is devoted to more complex aspects of anaphora. One limitation of virtually all coreference resolution models is the focus on single-antecedent anaphors. Plural anaphors with multiple antecedents-so-called split-antecedent anaphors (as in John met Mary. They went to the movies) have not been widely studied, because they are not annotated in ONTONOTES and are relatively infrequent in other corpora. In this paper, we introduce the first model for unrestricted resolution of split-antecedent anaphors. We start with a strong baseline enhanced by BERT embeddings, and show that we can substantially improve its performance by addressing the sparsity issue. To do this, we experiment with auxiliary corpora where split-antecedent anaphors were annotated by the crowd, and with transfer learning models using element-of bridging references and single-antecedent coreference as auxiliary tasks. Evaluation on the gold annotated ARRAU corpus shows that the out best model uses a combination of three auxiliary corpora achieved F1 scores of 70% and 43.6% when evaluated in a lenient and strict setting, respectively, i.e., 11 and 21 percentage points gain when compared with our baseline.

Computation and Language

CITlab ARGUS for Arabic Handwriting

106 - Gundram Leifert , Roger Labahn , Tobias Strau{ss} 2014

In the recent years it turned out that multidimensional recurrent neural networks (MDRNN) perform very well for offline handwriting recognition tasks like the OpenHaRT 2013 evaluation DIR. With suitable writing preprocessing and dictionary lookup, our ARGUS software completed this task with an error rate of 26.27% in its primary setup.

Computer Vision and Pattern Recognition Neural and Evolutionary Computing

Neural Coreference Resolution for Arabic

119 - Abdulrahman Aloraini , Juntao Yu , Massimo Poesio 2020

No neural coreference resolver for Arabic exists, in fact we are not aware of any learning-based coreference resolver for Arabic since (Bjorkelund and Kuhn, 2014). In this paper, we introduce a coreference resolution system for Arabic based on Lee et als end to end architecture combined with the Arabic version of bert and an external mention detector. As far as we know, this is the first neural coreference resolution system aimed specifically to Arabic, and it substantially outperforms the existing state of the art on OntoNotes 5.0 with a gain of 15.2 points conll F1. We also discuss the current limitations of the task for Arabic and possible approaches that can tackle these challenges.

Computation and Language

comments

Fetching comments

Tartous University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Analysis of the Arabic Broken Plural and Diminutive

Ask ChatGPT about the research

No Arabic abstract

This paper demonstrates how the challenging problem of the Arabic broken plural and diminutive can be handled under a multi-tape two-level model, an extension to two-level morphology.

Read More