Do you want to publish a course? Click here

Personality Trait Identification Using the Russian Feature Extraction Toolkit

شخصية شخصية الهوية باستخدام مجموعة أدوات استخراج الميزة الروسية

300   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

Feature engineering is an important step in classical NLP pipelines, but machine learning engineers may not be aware of the signals to look for when processing foreign language text. The Russian Feature Extraction Toolkit (RFET) is a collection of feature extraction libraries bundled for ease of use by engineers who do not speak Russian. RFET's current feature set includes features applicable to social media genres of text and to computational social science tasks. We demonstrate the effectiveness of the tool by using it in a personality trait identification task. We compare the performance of Support Vector Machines (SVMs) trained with and without the features provided by RFET; we also compare it to a SVM with neural embedding features generated by Sentence-BERT.



References used
https://aclanthology.org/
rate research

Read More

This research aims at showing the importance of the child drama on activating child's imagination and developing his creative talents and abilities. It also aims at tracing drama history, its origins, and kinds in poetry, prose and drama fields. It also aims at showing the effect of school drama on the development of child personality and activating the processes of creation and technical creativity.
Endowing a task-oriented dialogue system with adaptiveness to user personality can greatly help improve the performance of a dialogue task. However, such a dialogue system can be practically challenging to implement, because it is unclear how user pe rsonality influences dialogue task performance. To explore the relationship between user personality and dialogue task performance, we enrolled participants via crowdsourcing to first answer specified personality questionnaires and then chat with a dialogue system to accomplish assigned tasks. A rule-based dialogue system on the prevalent Multi-Domain Wizard-of-Oz (MultiWOZ) task was used. A total of 211 participants' personalities and their 633 dialogues were collected and analyzed. The results revealed that sociable and extroverted people tended to fail the task, whereas neurotic people were more likely to succeed. We extracted features related to user dialogue behaviors and performed further analysis to determine which kind of behavior influences task performance. As a result, we identified that average utterance length and slots per utterance are the key features of dialogue behavior that are highly correlated with both task performance and user personality.
The paper describes the TenTrans's submissions to the WMT 2021 Efficiency Shared Task. We explore training a variety of smaller compact transformer models using the teacher-student setup. Our model is trained by our self-developed open-source multili ngual training platform TenTrans-Py. We also release an open-source high-performance inference toolkit for transformer models and the code is written in C++ completely. All additional optimizations are built on top of the inference engine including attention caching, kernel fusion, early-stop, and several other optimizations. In our submissions, the fastest system can translate more than 22,000 tokens per second with a single Tesla P4 while maintaining 38.36 BLEU on En-De newstest2019. Our trained models and more details are available in TenTrans-Decoding competition examples.
We present an open-source toolkit for Danish Natural Language Processing, enabling easy access to Danish NLP's latest advancements. The toolkit features wrapper-functions for loading models and datasets in a unified way using third-party NLP framewor ks. The toolkit is developed to enhance community building, understanding the need from industry and knowledge sharing. As an example of this, we present Angry Tweets: An Annotation Game to create awareness of Danish NLP and create a new sentiment-annotated dataset.
The paper reports the results of a translationese study of literary texts based on translated and non-translated Russian. We aim to find out if translations deviate from non-translated literary texts, and if the established differences can be attribu ted to typological relations between source and target languages. We expect that literary translations from typologically distant languages should exhibit more translationese, and the fingerprints of individual source languages (and their families) are traceable in translations. We explore linguistic properties that distinguish non-translated Russian literature from translations into Russian. Our results show that non-translated fiction is different from translations to the degree that these two language varieties can be automatically classified. As expected, language typology is reflected in translations of literary texts. We identified features that point to linguistic specificity of Russian non-translated literature and to shining-through effects. Some of translationese features cut across all language pairs, while others are characteristic of literary translations from languages belonging to specific language families.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا