How to Obtain Reliable Labels for MBTI Classification from Texts?

Added by Association for Computational Linguistics (article)

Publication date 2021

fields Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: classification from texts, obtain reliable labels, MBTI classification

Abstract

Automatic detection of the Myers-Briggs Type Indicator (MBTI) from short posts has attracted noticeable attention in the last few years. Recent studies showed that this is quite a difficult task, especially on commonly used Twitter data. Obtaining MBTI labels is also difficult, as human annotation requires trained psychologists, and the automatic way of obtaining them is through long questionnaires of questionable usability for the task. In this paper, we present a method for collecting reliable MBTI labels via only four carefully selected questions that can be applied to any type of textual data.

References used

https://aclanthology.org/

CHoRaL: Collecting Humor Reaction Labels from Millions of Social Media Users

Association for Computational Linguistics, 2021 (article)

Humor detection has gained attention in recent years due to the desire to understand user-generated content with figurative language. However, substantial individual and cultural differences in humor perception make it very difficult to collect a large-scale humor dataset with reliable humor labels. We propose CHoRaL, a framework to generate perceived humor labels on Facebook posts, using the naturally available user reactions to these posts with no manual annotation needed. CHoRaL provides both binary labels and continuous scores of humor and non-humor. We present the largest dataset to date with labeled humor on 785K posts related to COVID-19. Additionally, we analyze the expression of COVID-related humor in social media by extracting lexico-semantic and affective features from the posts, and build humor detection models with performance similar to humans. CHoRaL enables the development of large-scale humor detection models on any topic and opens a new path to the study of humor on social media.

Keywords: collecting humor reaction, collecting humor

Myelomeningoceles and How to Reduce their Incidence

Damascus University, 2012 (research paper)

Myelomeningoceles are a very common anomaly in our country. Most cases end with permanent damage and handicap, and many of these children die of meningitis as a complication. A large number of children with myelomeningoceles still seek medical care in pediatric hospitals and other health centers. We must therefore identify the causes and predisposing factors of myelomeningoceles in order to reduce their incidence.

Keywords: Myelomeningocele, Folic Acid, Meningitis

Period Classification in Chinese Historical Texts

Association for Computational Linguistics, 2021 (article)

In this work, we study language change in Chinese Biji through a classification task: classifying Ancient Chinese texts by time period. Specifically, we focus on a unique genre in classical Chinese literature: Biji (literally "notebook" or "brush notes"), i.e., collections of anecdotes, quotations, and anything else authors consider noteworthy. Biji span hundreds of years across many dynasties and conserve informal language in written form. For these reasons, they are regarded as a good resource for investigating language change in Chinese (Fang, 2010). In this paper, we create a new dataset of 108 Biji across four dynasties. Based on the dataset, we first introduce a time period classification task for Chinese. Then we investigate different feature representation methods for classification. The results show that models using contextualized embeddings perform best. An analysis of the top features chosen by the word n-gram model (after bleaching proper nouns) confirms that these features are informative and correspond to observations and assumptions made by historical linguists.

Keywords: Ancient Chinese texts, Chinese historical texts, classifying Ancient Chinese

How to leverage the multimodal EHR data for better medical prediction?

Association for Computational Linguistics, 2021 (article)

Healthcare has recently become an increasingly important research topic. The growing volume of data in the healthcare domain offers a great opportunity for deep learning to improve the quality of service and reduce costs. However, the complexity of electronic health records (EHR) data is a challenge for the application of deep learning. Specifically, the data produced in hospital admissions are monitored by the EHR system, which includes structured data like daily body temperature and unstructured data like free text and laboratory measurements. Although some preprocessing frameworks have been proposed for specific EHR data, the clinical notes that contain significant clinical value are beyond the realm of their consideration. Besides, whether these different data from various views are all beneficial to the medical tasks and how to best utilize these data remain unclear. Therefore, in this paper, we first extract the accompanying clinical notes from EHR and propose a method to integrate these data. We also comprehensively study the different models and the data leverage methods for better medical task prediction performance. The results on two prediction tasks show that our fused model with different data outperforms the state-of-the-art method without clinical notes, which illustrates the importance of our fusion method and the clinical note features.

Keywords: multimodal EHR data, EHR data

Knowledge Distillation with Noisy Labels for Natural Language Understanding

Association for Computational Linguistics, 2021 (article)

Knowledge Distillation (KD) is extensively used to compress and deploy large pre-trained language models on edge devices for real-world applications. However, one neglected area of research is the impact of noisy (corrupted) labels on KD. We present, to the best of our knowledge, the first study on KD with noisy labels in Natural Language Understanding (NLU). We document the scope of the problem and present two methods to mitigate the impact of label noise. Experiments on the GLUE benchmark show that our methods are effective even under high noise levels. Nevertheless, our results indicate that more research is necessary to cope with label noise in the KD setting.

Keywords: natural language understanding, language understanding, natural language
