This paper introduces NorecNeg -- the first annotated dataset of negation for Norwegian. Negation cues and their in-sentence scopes have been annotated across more than 11K sentences spanning more than 400 documents for a subset of the Norwegian Review Corpus (NoReC). In addition to providing an in-depth discussion of the annotation guidelines, we also present a first set of benchmark results based on a graph-parsing approach.
References used
https://aclanthology.org/