We present DreamDrug, a crowdsourced dataset for detecting mentions of drugs in noisy user-generated item listings from darknet markets. Our dataset contains nearly 15,000 manually annotated drug entities in over 3,500 item listings scraped from the darknet market platform DreamMarket in 2017. We also train and evaluate baseline models for detecting these entities, using contextual language models fine-tuned in a few-shot setting and on the full dataset, and examine the effect of pretraining on in-domain unannotated corpora.
References used
https://aclanthology.org/
The stance detection task aims at detecting the stance of a tweet or a text for a target. These targets can be named entities or free-form sentences (claims). Though the task involves reasoning of the tweet with respect to a target, we find that it i
In Romanian language there are some resources for automatic text comprehension, but for Emotion Detection, not lexicon-based, there are none. To cover this gap, we extracted data from Twitter and created the first dataset containing tweets annotated
As the world continues to fight the COVID-19 pandemic, it is simultaneously fighting an "infodemic", a flood of disinformation and spread of conspiracy theories leading to health threats and the division of society. To combat this infodemic, there i
People utilize online forums to either look for information or to contribute it. Because of their growing popularity, certain online forums have been created specifically to provide support, assistance, and opinions for people suffering from mental i
In this paper, we introduce a new English Twitter-based dataset for cyberbullying detection and online abuse. Comprising 62,587 tweets, this dataset was sourced from Twitter using specific query terms designed to retrieve tweets with high probabiliti