ﻻ يوجد ملخص باللغة العربية
Spell checking and morphological analysis are two fundamental tasks in text and natural language processing and are addressed in the early stages of the development of language technology. Despite the previous efforts, there is no progress in open-source to create such tools for Sorani Kurdish, also known as Central Kurdish, as a less-resourced language. In this paper, we present our efforts in annotating a lexicon with morphosyntactic tags and also, extracting morphological rules of Sorani Kurdish to build a morphological analyzer, a stemmer and a spell-checking system using Hunspell. This implementation can be used for further developments in the field by researchers and also, be integrated into text editors under a publicly available license.
The present paper aims at presenting a lemmatization and a word-level error correction system for Sorani Kurdish. We propose a hybrid approach based on the morphological rules and a n-gram language model. We have called our lemmatization and error co
Sorani Kurdish, also known as Central Kurdish, has a complex morphology, particularly due to the patterns in which morphemes appear. Although several aspects of Kurdish morphology have been studied, such as pronominal endoclitics and Izafa constructi
We present an experimental dataset, Basic Dataset for Sorani Kurdish Automatic Speech Recognition (BD-4SK-ASR), which we used in the first attempt in developing an automatic speech recognition for Sorani Kurdish. The objective of the project was to d
Segmentation is a fundamental step for most Natural Language Processing tasks. The Kurdish language is a multi-dialect, under-resourced language which is written in different scripts. The lack of various segmented corpora is one of the major bottlene
Chinese Spell Checking (CSC) aims to detect and correct erroneous characters for user-generated text in the Chinese language. Most of the Chinese spelling errors are misused semantically, phonetically or graphically similar characters. Previous attem