Research papers and master's and doctoral theses about dialect

138 - Association for Computational Linguistics 2021 Article

Building NLP systems that serve everyone requires accounting for dialect differences. But dialects are not monolithic entities: rather, distinctions between and within dialects are captured by the presence, absence, and frequency of dozens of dialect features in speech and text, such as the deletion of the copula in "He ∅ running". In this paper, we introduce the task of dialect feature detection, and present two multitask learning approaches, both based on pretrained transformers. For most dialects, large-scale annotated corpora for these features are unavailable, making it difficult to train recognizers. We train our models on a small number of minimal pairs, building on how linguists typically define dialect features. Evaluation on a test set of 22 dialect features of Indian English demonstrates that these models learn to recognize many features with high accuracy, and that a few minimal pairs can be as effective for training as thousands of labeled examples. We also demonstrate the downstream applicability of dialect feature detection both as a measure of dialect density and as a dialect classifier.

dialect features, dialect feature detection, dialect

Country-level Arabic Dialect Identification using RNNs with and without Linguistic Features

354 - Association for Computational Linguistics 2021 Article

This work investigates the value of augmenting recurrent neural networks with feature engineering for the Second Nuanced Arabic Dialect Identification (NADI) Subtask 1.2: Country-level DA identification. We compare the performance of a simple word-level LSTM using pretrained embeddings with one enhanced using feature embeddings for engineered linguistic features. Our results show that the addition of explicit features to the LSTM is detrimental to performance. We attribute this performance loss to the bivalency of some linguistic items in some text, ubiquity of topics, and participant mobility.

country-level Arabic dialect

Naive Bayes-based Experiments in Romanian Dialect Identification

222 - Association for Computational Linguistics 2021 Article

This article describes the experiments and systems developed by the SUKI team for the second edition of the Romanian Dialect Identification (RDI) shared task, which was organized as part of the 2021 VarDial Evaluation Campaign. We submitted two runs to the shared task, and our second submission was the overall best submission by a noticeable margin. Our best submission used a character n-gram based naive Bayes classifier with adaptive language models. We describe our experiments on the development set leading to both submissions.

Romanian dialect identification, dialect identification, Romanian dialect
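As a rough illustration of the kind of classifier the abstract above names, the following is a minimal sketch of a character n-gram naive Bayes classifier with add-alpha smoothing. It is not the SUKI team's system: the adaptive language models are omitted, and the class name `CharNgramNaiveBayes`, the helper `char_ngrams`, the n-gram range, the smoothing value, and the toy strings and labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max, with space padding."""
    padded = f" {text.lower()} "
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


class CharNgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n_min=1, n_max=3, alpha=1.0):
        self.n_min, self.n_max, self.alpha = n_min, n_max, alpha
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()               # label -> number of training texts
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n_min, self.n_max)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Return log prior + summed smoothed log likelihoods for each label."""
        grams = char_ngrams(text, self.n_min, self.n_max)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counts in self.ngram_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in grams:
                # Add-alpha smoothing; unseen n-grams get the smoothed floor.
                score += math.log(
                    (counts[gram] + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy usage with made-up strings and hypothetical labels (not real dialect data).
model = CharNgramNaiveBayes().fit(
    ["aaaa aaab", "abab aaaa", "zzzz zzzy", "zyzy zzzz"],
    ["variety_1", "variety_1", "variety_2", "variety_2"],
)
print(model.predict("aaab"))
```

Character n-grams are a common choice for closely related varieties because they capture spelling and morphological cues without requiring tokenization; whether this particular range and smoothing would suit the RDI data is untested here.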

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

296 - Association for Computational Linguistics 2021 Article

We present the findings and results of the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2). The shared task dataset covers a total of 100 provinces from 21 Arab countries, collected from the Twitter domain. A total of 53 teams from 23 countries registered to participate in the tasks, thus reflecting the interest of the community in this area. We received 16 submissions for Subtask 1.1 from five teams, 27 submissions for Subtask 1.2 from eight teams, 12 submissions for Subtask 2.1 from four teams, and 13 submissions for Subtask 2.2 from four teams.

nuanced Arabic dialect, nuanced Arabic, Arabic dialect identification

"The Third Language 'in-between' in the Novels of Ahmed Youssef Dawood"

1181 - Tishreen University 2018 Research paper

Language is a means of communication that plays a major role in enriching human life and in shaping people's relations with their environment and with the society in which they were born and raised. Language has always been the product of that society, whose progress and regress leave their mark on it. It is well known that Standard Arabic is the official language, with its precise grammar and vocabulary handed down from ancestor to descendant. However, most people, regardless of their level of education, often find it difficult to use or access. It is also difficult for this language to convey reality as clearly as it is, or to express how easy and spontaneous life is for all people. Because the coexistence of a vernacular alongside a standard language is a linguistic phenomenon found all over the world, the Arabic novel in general, and the rural novel in particular, came to need an in-between third language that is neither standard nor vernacular. This language should be capable of bringing the standard closer to daily life, yielding a single form of dialogue that gives characters their psychological and social traits; a tacit language for readers of all cultural and educational levels and social standings. It also helps the text express the human emotions that emerge subconsciously, which the standard language is incapable of conveying. Needless to say, Standard Arabic was itself once a vernacular with different dialects, referred to by words such as "language" and "tongue." Allah said: ("We have not sent but a messenger to represent his nation and clarify the truth to them. For, God guide and misguide whomsoever thus He is the Noble and Wise").

language, standard, vernacular, dialect, Ahmad Dawood, third language (in-between)
