Do you want to publish a course? Click here

Identifying professions \& occupations in Health-related Social Media using Natural Language Processing

تحديد المهن \ والمهن في وسائل التواصل الاجتماعي المرتبطة بالصحة باستخدام معالجة اللغة الطبيعية

271   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

This paper describes the entry of the research group SINAI at SMM4H's ProfNER task on the identification of professions and occupations in social media related with health. Specifically we have participated in Task 7a: Tweet Binary Classification to determine whether a tweet contains mentions of occupations or not, as well as in Task 7b: NER Offset Detection and Classification aimed at predicting occupations mentions and classify them discriminating by professions and working statuses.



References used
https://aclanthology.org/
rate research

Read More

ProfNER-ST focuses on the recognition of professions and occupations from Twitter using Spanish data. Our participation is based on a combination of word-level embeddings, including pre-trained Spanish BERT, as well as cosine similarity computed over a subset of entities that serve as input for an encoder-decoder architecture with attention mechanism. Finally, our best score achieved an F1-measure of 0.823 in the official test set.
This paper presents our contribution to the ProfNER shared task. Our work focused on evaluating different pre-trained word embedding representations suitable for the task. We further explored combinations of embeddings in order to improve the overall results.
Social media texts such as blog posts, comments, and tweets often contain offensive languages including racial hate speech comments, personal attacks, and sexual harassment. Detecting inappropriate use of language is, therefore, of utmost importance for the safety of the users as well as for suppressing hateful conduct and aggression. Existing approaches to this problem are mostly available for resource-rich languages such as English and German. In this paper, we characterize the offensive language in Nepali, a low-resource language, highlighting the challenges that need to be addressed for processing Nepali social media text. We also present experiments for detecting offensive language using supervised machine learning. Besides contributing the first baseline approaches of detecting offensive language in Nepali, we also release human annotated data sets to encourage future research on this crucial topic.
In this paper we study pejorative language, an under-explored topic in computational linguistics. Unlike existing models of offensive language and hate speech, pejorative language manifests itself primarily at the lexical level, and describes a word that is used with a negative connotation, making it different from offensive language or other more studied categories. Pejorativity is also context-dependent: the same word can be used with or without pejorative connotations, thus pejorativity detection is essentially a problem similar to word sense disambiguation. We leverage online dictionaries to build a multilingual lexicon of pejorative terms for English, Spanish, Italian, and Romanian. We additionally release a dataset of tweets annotated for pejorative use. Based on these resources, we present an analysis of the usage and occurrence of pejorative words in social media, and present an attempt to automatically disambiguate pejorative usage in our dataset.
Given the current social distancing regulations across the world, social media has become the primary mode of communication for most people. This has isolated millions suffering from mental illnesses who are unable to receive assistance in person. Th ey have increasingly turned to online platforms to express themselves and to look for guidance in dealing with their illnesses. Keeping this in mind, we propose a solution to classify mental illness posts on social media thereby enabling users to seek appropriate help. In this work, we classify five prominent kinds of mental illnesses- depression, anxiety, bipolar disorder, ADHD and PTSD by analyzing unstructured user data on Reddit. In addition, we share a new high-quality dataset1 to drive research on this topic. The dataset consists of the title and post texts from 17159 posts and 13 subreddits each associated with one of the five mental illnesses listed above or a None class indicating the absence of any mental illness. Our model is trained on Reddit data but is easily extensible to other social media platforms as well as demonstrated in our results.We believe that our work is the first multi-class model that uses a Transformer based architecture such as RoBERTa to analyze people's emotions and psychology. We also demonstrate how we stress test our model using behavioral testing. Our dataset is publicly available and we encourage researchers to utilize this to advance research in this arena. We hope that this work contributes to the public health system by automating some of the detection process and alerting relevant authorities about users that need immediate help.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا