
BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter

Added by Association for Computational Linguistics Article

Publication date 2021

Field Artificial Intelligence

Research language English

Created by Shamra Editor

classifying medical self-reporters, lesser error, self-reporters on Twitter

Abstract

This paper describes our team's submission for the Social Media Mining for Health (SMM4H) 2021 shared task. We participated in three subtasks: classifying adverse drug effects, COVID-19 self-reports, and COVID-19 symptoms. Our system is based on a BERT model pre-trained on domain-specific text. In addition, we perform data cleaning and augmentation, as well as hyperparameter optimization and model ensembling, to further boost BERT's performance. We achieved first rank in both the adverse drug effect classification task and the COVID-19 self-report task.
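The abstract mentions a model ensemble but does not say how the individual models are combined. As an illustration only, the sketch below uses soft voting (averaging class probabilities across models), which is one common choice and an assumption here, not the authors' confirmed method. The function name `soft_vote` and the probability values are hypothetical.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax per example.

    prob_sets[m][i][c] is model m's probability that example i has class c.
    Soft voting is an assumed combination rule, not taken from the paper.
    """
    n_models = len(prob_sets)
    n_examples = len(prob_sets[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(prob_sets[0][i])
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Made-up outputs of three fine-tuned classifiers on two tweets.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.4, 0.6], [0.2, 0.8]]
model_c = [[0.7, 0.3], [0.45, 0.55]]
labels = soft_vote([model_a, model_b, model_c])
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh two uncertain ones, which can matter when the individual models disagree.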

References used

https://aclanthology.org/

Opinions Mining in Twitter

2337 - Al-Baath University 2016 Research paper

We collect data from Twitter pages and then clean and preprocess the text for classification, because the retrieved texts contain a lot of noise and information that is not useful for opinion analysis, such as advertisements, links, e-mail addresses, and many words that do not affect the general orientation of the text. We then retrieve all posts on a Twitter page together with the comments on each tweet, in order to determine the proportion of supporters and opponents of each post. We apply the Naïve Bayes algorithm for classification. After appropriate training, and after passing in the post and comment data (opinions), we obtained good results for the ratio of supporters and the ratio of opponents of a post.

access token, sentiment classification, opinions mining
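The cleaning step described in the entry above (dropping links, e-mail addresses, and words that carry no orientation) could be sketched roughly as follows. The regular expressions, the tiny English stopword set, and the function name `clean_tweet` are assumptions chosen for illustration; the abstract does not give the actual rules or the language-specific resources used.

```python
import re
from typing import List

# Illustrative patterns; the original work's exact rules are not specified.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
# Hypothetical stopword list standing in for "words that do not affect
# the general orientation of the text".
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and"}


def clean_tweet(text: str) -> List[str]:
    """Strip links and e-mail addresses, lowercase, tokenize, drop stopwords."""
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


tokens = clean_tweet(
    "I love the new phone! Details at https://example.com or mail info@example.com"
)
```

The resulting token lists would then be the input to whatever classifier is trained, for example a Naïve Bayes model as the abstract describes.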

Arabic Offensive Language on Twitter: Analysis and Experiments

227 - Association for Computational Linguistics 2021 Article

Detecting offensive language on Twitter has many applications ranging from detecting/predicting bullying to measuring polarization. In this paper, we focus on building a large Arabic offensive tweet dataset. We introduce a method for building a dataset that is not biased by topic, dialect, or target. We produce the largest Arabic dataset to date with special tags for vulgarity and hate speech. We thoroughly analyze the dataset to determine which topics, dialects, and gender are most associated with offensive tweets and how Arabic speakers use offensive language. Lastly, we conduct many experiments to produce strong results (F1 = 83.2) on the dataset using SOTA techniques.

language on Twitter, Arabic offensive language

Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection

247 - Association for Computational Linguistics 2021 Article

The 2020 US Elections have been, more than ever before, characterized by social media campaigns and mutual accusations. We investigate in this paper if this manifests also in online communication of the supporters of the candidates Biden and Trump, by uttering hateful and offensive communication. We formulate an annotation task, in which we join the tasks of hateful/offensive speech detection and stance detection, and annotate 3000 Tweets from the campaign period, if they express a particular stance towards a candidate. Next to the established classes of favorable and against, we add mixed and neutral stances and also annotate if a candidate is mentioned without an opinion expression. Further, we annotate if the tweet is written in an offensive style. This enables us to analyze if supporters of Joe Biden and the Democratic Party communicate differently than supporters of Donald Trump and the Republican Party. A BERT baseline classifier shows that the detection if somebody is a supporter of a candidate can be performed with high quality (.89 F1 for Trump and .91 F1 for Biden), while the detection that somebody expresses to be against a candidate is more challenging (.79 F1 and .64 F1, respectively). The automatic detection of hate/offensive speech remains challenging (with .53 F1). Our corpus is publicly available and constitutes a novel resource for computational modelling of offensive language under consideration of stances.

Twitter corpus study, political opponent, Twitter corpus

Application of Mix-Up Method in Document Classification Task Using BERT

185 - Association for Computational Linguistics 2021 Article

The mix-up method (Zhang et al., 2017), one of the methods for data augmentation, is known to be easy to implement and highly effective. Although the mix-up method is intended for image identification, it can also be applied to natural language processing. In this paper, we attempt to apply the mix-up method to a document classification task using bidirectional encoder representations from transformers (BERT) (Devlin et al., 2018). Since BERT allows for two-sentence input, we concatenated word sequences from two documents with different labels and used the multi-class output as the supervised data with a one-hot vector. In an experiment using the livedoor news corpus, which is Japanese, we compared the accuracy of document classification using two methods for selecting documents to be concatenated with that of ordinary document classification. As a result, we found that the proposed method is better than the normal classification when the documents with labels shortages are mixed preferentially. This indicates that how to choose documents for mix-up has a significant impact on the results.

mix-up method, document classification task, document classification
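The concatenation idea in the entry above can be illustrated with a small sketch. It assumes the standard mix-up rule for targets, a weighted combination of the two one-hot label vectors with weight `lam` (0.5 by default); the abstract does not spell out the exact target construction, so this is one plausible reading rather than the authors' procedure. The helper names `one_hot` and `mix_pair` and the literal `[CLS]`/`[SEP]` strings are illustrative; a real pipeline would let the tokenizer build the sentence pair.

```python
from typing import List, Tuple


def one_hot(label: int, n_classes: int) -> List[float]:
    """Return a one-hot vector for the given class index."""
    return [1.0 if c == label else 0.0 for c in range(n_classes)]


def mix_pair(
    doc_a: str,
    label_a: int,
    doc_b: str,
    label_b: int,
    n_classes: int,
    lam: float = 0.5,
) -> Tuple[str, List[float]]:
    """Concatenate two documents as a BERT-style sentence pair and mix their labels.

    The target is lam * one_hot(label_a) + (1 - lam) * one_hot(label_b),
    the usual mix-up rule (an assumption for this sketch).
    """
    text = f"[CLS] {doc_a} [SEP] {doc_b} [SEP]"
    ya = one_hot(label_a, n_classes)
    yb = one_hot(label_b, n_classes)
    target = [lam * a + (1.0 - lam) * b for a, b in zip(ya, yb)]
    return text, target


text, target = mix_pair("sports news", 0, "tech news", 2, n_classes=3)
```

Choosing which pairs to mix is the part the abstract reports as mattering most; a sampler that prefers documents from under-represented classes would be plugged in before `mix_pair`.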

Accountable Error Characterization

49 - Association for Computational Linguistics 2021 Article

Customers of machine learning systems demand accountability from the companies employing these algorithms for various prediction tasks. Accountability requires understanding of system limit and condition of erroneous predictions, as customers are often interested in understanding the incorrect predictions, and model developers are absorbed in finding methods that can be used to get incremental improvements to an existing system. Therefore, we propose an accountable error characterization method, AEC, to understand when and where errors occur within the existing black-box models. AEC, as constructed with human-understandable linguistic features, allows the model developers to automatically identify the main sources of errors for a given classification system. It can also be used to sample for the set of most informative input points for a next round of training. We perform error detection for a sentiment analysis task using AEC as a case study. Our results on the sample sentiment task show that AEC is able to characterize erroneous predictions into human understandable categories and also achieves promising results on selecting erroneous samples when compared with the uncertainty-based sampling.

accountable error characterization, error characterization, AEC
