اعتمدت نهج استخراج المعلومات الحديثة على تدريب النماذج العصبية العميقة. ومع ذلك، يمكن أن تتجاوز هذه النماذج بسهولة الملصقات الصاخبة وتعاني من تدهور الأداء. في حين أنه من المكلف للغاية تصفية الملصقات الصاخبة في موارد تعليمية كبيرة، فإن الدراسات الحديثة تظهر أن مثل هذه الملصقات تتخذ المزيد من الخطوات التدريبية التي سيتم حفظها وتكون نسيانها بشكل أكثر تواترا من الملصقات النظيفة، وبالتالي يتم تحديدها في التدريب. بدافع من هذه الخصائص، نقترح إطارا بسيطا بانتظام بسيطة لاستخراج المعلومات التركز على الكيان، والذي يتكون من العديد من النماذج العصبية مع هياكل متطابقة ولكن تهيئة معلمة مختلفة. يتم تحسين هذه النماذج بشكل مشترك مع الخسائر الخاصة بالمهمة ويتم تنظيمها لتوليد تنبؤات مماثلة تستند إلى فقدان اتفاقية، تمنع التجديدات الخارجية على الملصقات الصاخبة. تظهر تجارب واسعة على نطاق واسع على نطاق واسع ولكن صاخبة لاستخراج المعلومات، Tacred و Conll03، فعالية إطار عملنا. نطلق سرد علاماتنا للمجتمع للبحث في المستقبل.
Recent information extraction approaches have relied on training deep neural models. However, such models can easily overfit noisy labels and suffer from performance degradation. While it is very costly to filter noisy labels in large learning resources, recent studies show that such labels take more training steps to be memorized and are more frequently forgotten than clean labels, therefore are identifiable in training. Motivated by such properties, we propose a simple co-regularization framework for entity-centric information extraction, which consists of several neural models with identical structures but different parameter initialization. These models are jointly optimized with the task-specific losses and are regularized to generate similar predictions based on an agreement loss, which prevents overfitting on noisy labels. Extensive experiments on two widely used but noisy benchmarks for information extraction, TACRED and CoNLL03, demonstrate the effectiveness of our framework. We release our code to the community for future research.
References used
https://aclanthology.org/
In order to alleviate the huge demand for annotated datasets for different tasks, many recent natural language processing datasets have adopted automated pipelines for fast-tracking usable data. However, model training with such datasets poses a chal
Information extraction and question answering have the potential to introduce a new paradigm for how machine learning is applied to criminal law. Existing approaches generally use tabular data for predictive metrics. An alternative approach is needed
Linguistic typology is an area of linguistics concerned with analysis of and comparison between natural languages of the world based on their certain linguistic features. For that purpose, historically, the area has relied on manual extraction of lin
Training NLP systems typically assumes access to annotated data that has a single human label per example. Given imperfect labeling from annotators and inherent ambiguity of language, we hypothesize that single label is not sufficient to learn the sp
Meta-learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples. However, applications to structured prediction and textual tasks pose challenges for meta-learning algorithms. In this paper, we a