Nowadays, named entity recognition (NER) achieves excellent results on standard corpora. However, significant problems arise when NER must be applied in a specific domain, because this requires a suitably annotated corpus with an adapted NE tag set. This is particularly evident in historical document processing. The main goal of this paper is to propose and evaluate several transfer learning methods to improve the score of Czech historical NER. We study several information sources and use two neural networks for NE modeling and recognition. We employ two corpora to evaluate our transfer learning methods, namely the Czech Named Entity Corpus and the Czech Historical Named Entity Corpus. We show that a fine-tuned BERT representation combined with only a simple classifier, trained on the union of the corpora, achieves excellent results.
References used
https://aclanthology.org/