In recent years, a number of studies have used linear models to predict personality from text. In this paper, we empirically analyze and compare the lexical signals captured in such models. We identify lexical cues for each dimension of the MBTI personality scheme in several different ways, considering different datasets, feature sets, and learning algorithms. We conduct a series of correlation analyses between the resulting MBTI data and explore their connection to other signals, such as those for Big-5 traits, emotion, sentiment, age, and gender. The analysis shows intriguing correlation patterns between different personality dimensions and other traits, and also provides evidence for the robustness of the data.
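As a rough illustration of what a correlation analysis between lexical signals could look like, the sketch below computes a Pearson correlation between the word weights of two linear models over their shared vocabulary. This is not the paper's implementation: the function names, the use of Pearson correlation, and all weights are assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def weight_correlation(weights_a, weights_b):
    """Correlate two word -> weight mappings on the words they share."""
    shared = sorted(set(weights_a) & set(weights_b))
    return pearson(
        [weights_a[w] for w in shared],
        [weights_b[w] for w in shared],
    )


# Hypothetical toy weights, invented for illustration only.
# In practice these would be coefficients of trained linear models.
introversion_weights = {
    "party": -0.8, "book": 0.6, "quiet": 0.7, "friends": -0.5, "game": 0.1,
}
age_weights = {
    "party": -0.4, "book": 0.5, "quiet": 0.3, "friends": -0.2, "work": 0.9,
}

r = weight_correlation(introversion_weights, age_weights)
print(f"correlation on shared vocabulary: {r:.3f}")
```

Only words present in both models enter the comparison, so "game" and "work" are ignored here. A rank-based measure such as Spearman correlation would be a reasonable alternative when weight scales differ between models.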
References used
https://aclanthology.org/