Contemporary tobacco-related studies mostly focus on a single social media platform and so miss a broader audience. Moreover, they rely heavily on labeled datasets, which are expensive to create. In this work, we explore sentiment and product identification on tobacco-related text from two social media platforms. We release the SentiSmoke-Twitter and SentiSmoke-Reddit datasets, along with a comprehensive annotation schema for identifying sentiment toward tobacco products. We then perform benchmark text classification experiments using state-of-the-art models, including BERT, RoBERTa, and DistilBERT. Our experiments show F1 scores as high as 0.72 for sentiment identification on the Twitter dataset and, using semi-supervised learning on the Reddit dataset, 0.46 for sentiment identification and 0.57 for product identification.
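For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score for a multi-class sentiment task can be computed. The abstract does not say whether the reported F1 is macro-, micro-, or weighted-averaged, so macro-averaging here is an assumption, and the function names and label strings are hypothetical illustrations rather than part of the released annotation schema.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from paired gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[gold] += 1   # gold label was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels, not taken from the SentiSmoke schema.
    gold = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
    pred = ["positive", "negative", "positive", "positive", "neutral", "neutral"]
    print(f1_per_class(gold, pred))
    print(round(macro_f1(gold, pred), 4))
```

Macro-averaging weights every class equally, which matters when some sentiment or product classes are rare; a micro- or weighted-average would give different numbers on the same predictions.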
References used
https://aclanthology.org/