Historical linguists have identified regularities in the process of historical sound change. The comparative method uses those regularities to reconstruct proto-words from the forms observed in daughter languages. Can this process be efficiently automated? We address the task of proto-word reconstruction, in which the model is exposed to cognates in contemporary daughter languages and has to predict the proto-word in the ancestor language. We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform the conventional methods applied to this task so far. Error analysis reveals variability in the ability of the neural models to capture different phonological changes, correlating with the complexity of the changes. Analysis of the learned embeddings reveals that the models learn phonologically meaningful generalizations, corresponding to well-attested phonological shifts documented by historical linguistics.
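To make the task setup concrete, the following is a minimal sketch of how one comparative entry might be serialized for a character-level sequence model. The language tags, the `</s>` end marker, the function names, and the Spanish/Italian/French/Latin example are all illustrative assumptions; they are not taken from the dataset or the models described above.

```python
from typing import Dict, List


def encode_cognate_set(cognates: Dict[str, str]) -> List[str]:
    """Flatten a cognate set into one character-level source sequence.

    Each daughter form is preceded by a hypothetical language tag token
    such as "<es>". Languages are sorted so the encoding is deterministic.
    """
    tokens: List[str] = []
    for lang in sorted(cognates):
        tokens.append(f"<{lang}>")
        tokens.extend(cognates[lang])  # one token per character
    return tokens


def encode_target(proto_word: str) -> List[str]:
    """Split the proto-word into characters and append an end marker."""
    return list(proto_word) + ["</s>"]


# Illustrative entry only: daughter-language cognates and an ancestor form.
entry = {"es": "noche", "it": "notte", "fr": "nuit"}
source = encode_cognate_set(entry)
target = encode_target("noctem")

print(source)
print(target)
```

A sequence model would then be trained to map `source` to `target`. Tagging each form with its language is one plausible way to let a single model see all daughter forms at once; other encodings (for example, one encoder per language) are equally conceivable.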
References used
https://aclanthology.org/