The paper describes the participation of the Lasige-BioTM team in sub-tracks A and B of ProfNER, which was based on: i) a BiLSTM-CRF model that leverages contextual and classical word embeddings to recognize and classify the mentions, and ii) a rule-based module to classify tweets. In the Evaluation phase, our model achieved an F1-score of 0.917 (0.031 above the median) in sub-track A and an F1-score of 0.727 (0.034 below the median) in sub-track B.
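The abstract does not spell out the rule used by the sub-track B module. A minimal sketch of one plausible rule, under the assumption (not confirmed by the source) that a tweet is labelled positive exactly when the NER component from sub-track A returns at least one profession-related mention; the function name and the example mention are hypothetical:

```python
def classify_tweet(mentions):
    """Hypothetical rule-based tweet classification for sub-track B:
    a tweet is labelled positive (1) if the NER model recognized at
    least one profession-related mention in it, negative (0) otherwise."""
    return 1 if mentions else 0

# Example: 'mentions' stands in for the output of the BiLSTM-CRF NER step
print(classify_tweet(["enfermera"]))  # 1
print(classify_tweet([]))             # 0
```

The actual module may combine several rules (e.g. gazetteer lookups or confidence thresholds); this sketch only illustrates how an NER-driven rule could produce the binary tweet label.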