Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language has not been determined. We propose a decipherment model that handles both challenges by building on rich linguistic constraints that reflect consistent patterns in historical sound change. We capture the natural phonological geometry by learning character embeddings based on the International Phonetic Alphabet (IPA). The resulting generative framework jointly models word segmentation and cognate alignment, informed by phonological constraints. We evaluate the model on two deciphered languages (Gothic, Ugaritic) and one undeciphered language (Iberian). The experiments show that incorporating phonetic geometry leads to clear and consistent gains. Additionally, we propose a measure of language closeness that correctly identifies related languages for Gothic and Ugaritic. For Iberian, the method does not show strong evidence supporting Basque as a related language, which agrees with the position favored by current scholarship.