Training NLP models often requires large amounts of labelled training data, which makes it difficult to extend existing models to new languages. While zero-shot cross-lingual transfer relies on multilingual word embeddings to apply a model trained on one language to another, Yarowsky and Ngai (2001) propose annotation projection as a way to generate training data without manual annotation. Annotation projection has been used successfully for named entity recognition and coarse-grained entity typing, but we show that it is outperformed by zero-shot cross-lingual transfer on the related task of fine-grained entity typing. In our study of fine-grained entity typing with the FIGER type ontology for German, we show that annotation projection amplifies the English model's tendency to underpredict level-2 labels and is beaten by zero-shot cross-lingual transfer on three novel test sets.
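To make the contrast with zero-shot transfer concrete, the following is a minimal sketch of the label-transfer step in annotation projection, assuming an English sentence already carries entity-type labels and a word alignment to its German translation is available (e.g. from an alignment tool); all token indices, labels, and the example sentence pair are illustrative, not taken from the paper's data.

    # Minimal sketch of the projection step in annotation projection for entity typing.
    # Assumption: source-side (English) type labels and a source-to-target word
    # alignment are already available; the data below is purely illustrative.

    from typing import Dict, List, Set, Tuple


    def project_labels(
        src_labels: Dict[int, Set[str]],       # source token index -> FIGER-style type labels
        alignment: List[Tuple[int, int]],      # (source index, target index) alignment pairs
        num_tgt_tokens: int,
    ) -> Dict[int, Set[str]]:
        """Copy each source token's type labels onto every aligned target token."""
        tgt_labels: Dict[int, Set[str]] = {i: set() for i in range(num_tgt_tokens)}
        for src_i, tgt_i in alignment:
            if src_i in src_labels:
                tgt_labels[tgt_i] |= src_labels[src_i]
        return tgt_labels


    if __name__ == "__main__":
        # English: "Angela Merkel visited Paris" -> German: "Angela Merkel besuchte Paris"
        src_labels = {
            0: {"/person", "/person/politician"},
            1: {"/person", "/person/politician"},
            3: {"/location", "/location/city"},
        }
        alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]
        print(project_labels(src_labels, alignment, num_tgt_tokens=4))

The projected labels then serve as (noisy) training data for a German model, whereas zero-shot transfer skips this step entirely and applies the English-trained model to German text through shared multilingual embeddings.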