Semantic divergence in related languages is a key concern of historical linguistics. We investigate cross-linguistically the semantic divergence of cognate pairs in English and Romance languages by means of word embeddings. To this end, we introduce a new curated dataset of cognates covering all pairs of those languages. We describe the types of errors that occurred during the automated cognate identification process and correct them manually. Additionally, we label the English cognates according to their etymology, separating them into two groups: old borrowings and recent borrowings. On this curated dataset, we analyse word properties such as frequency and polysemy, and the distribution of similarity scores between cognate sets in different languages. We automatically identify different clusters of English cognates, opening a new direction for research on cognates, borrowings, and possibly false friends in related languages.
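As a rough illustration of how similarity scores between cognate pairs might be computed from word embeddings, the sketch below ranks pairs by cosine similarity. It rests on several assumptions: the abstract does not name the similarity measure (cosine is assumed here), the embeddings of both languages are assumed to already live in a shared cross-lingual space, and the function names and the toy three-dimensional vectors are hypothetical and not taken from the dataset.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_divergence(cognate_pairs, emb_a, emb_b):
    """Return (word_a, word_b, similarity), most divergent pair first.

    Pairs with a word missing from either embedding table are skipped.
    """
    scored = [
        (a, b, cosine_similarity(emb_a[a], emb_b[b]))
        for a, b in cognate_pairs
        if a in emb_a and b in emb_b
    ]
    # Lower cosine similarity is read here as stronger semantic divergence.
    return sorted(scored, key=lambda item: item[2])


# Hand-made toy vectors (hypothetical, for illustration only).
EN = {"library": [0.9, 0.1, 0.0], "family": [0.1, 0.9, 0.2]}
FR = {"librairie": [0.2, 0.1, 0.9], "famille": [0.1, 0.8, 0.3]}
PAIRS = [("library", "librairie"), ("family", "famille")]

ranked = rank_by_divergence(PAIRS, EN, FR)
for en_word, fr_word, sim in ranked:
    print(f"{en_word:10s} {fr_word:10s} {sim:.3f}")
```

With these toy vectors the pair library / librairie (French for "bookshop", a well-known false friend) ranks as most divergent. In practice the vectors would come from pretrained embeddings aligned across languages, and a low score would only flag a candidate for closer inspection.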
References used
https://aclanthology.org/