Making Your Tweets More Fancy: Emoji Insertion to Texts

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

On social media, users frequently include small images called emojis in their posts. Although emojis play a key role in modern communication systems, less attention has been paid to their positions within texts, even though users carefully choose and place an emoji that matches their post. Exploring the positions of emojis in texts will enhance understanding of the relationship between emojis and texts. We extend the emoji label prediction task to take emoji positions into account, by jointly learning the emoji position in a tweet to predict the emoji label. The results demonstrate that the position of emojis in texts is a good clue for boosting the performance of emoji label prediction. Human evaluation validates that a suitable emoji position exists in a tweet, and that our proposed task can make tweets fancier and more natural. In addition, considering emoji position further improves performance on the irony detection task compared to emoji label prediction. We also report experimental results on a modified dataset, because of a problem with the original dataset of the first shared task on emoji label prediction at SemEval 2018.
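To make the idea of jointly learning an emoji's label and its position concrete, here is a minimal sketch. The abstract does not say how the two objectives are combined, so the weighted sum of two cross-entropy losses, the `position_weight` parameter, and every function name below are assumptions for illustration only, not the authors' method.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, gold_index):
    """Negative log-probability assigned to the gold class."""
    return -math.log(softmax(scores)[gold_index])

def joint_loss(label_scores, gold_label, position_scores, gold_position,
               position_weight=1.0):
    """Assumed joint objective: label loss plus weighted position loss."""
    label_loss = cross_entropy(label_scores, gold_label)
    position_loss = cross_entropy(position_scores, gold_position)
    return label_loss + position_weight * position_loss

def insert_emoji(tokens, emoji, position):
    """Insert the emoji before tokens[position]; position == len(tokens) appends."""
    if not 0 <= position <= len(tokens):
        raise ValueError("position out of range")
    return tokens[:position] + [emoji] + tokens[position:]

# Usage: a model scoring 2 emoji labels and 3 insertion slots for a 2-token tweet.
tweet = ["good", "morning"]
print(insert_emoji(tweet, "\u2600", 2))
print(joint_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0], 1))
```

In this sketch a tweet with n tokens has n + 1 candidate slots, so position prediction is treated as one more classification problem alongside label prediction.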

References used

https://aclanthology.org/

Arabic Emoji Sentiment Lexicon (Arab-ESL): A Comparison between Arabic and European Emoji Sentiment Lexicons

Association for Computational Linguistics, 2021 (article)

Emoji (the popular digital pictograms) are sometimes seen as a new kind of artificial and universally usable and consistent writing code. In spite of their assumed universality, there is some evidence that the sense of an emoji, specifically in regard to sentiment, may change from language to language and culture to culture. This paper investigates whether contextual emoji sentiment analysis is consistent across Arabic and European languages. To conduct this investigation, we, first, created the Arabic emoji sentiment lexicon (Arab-ESL). Then, we exploited an existing European emoji sentiment lexicon to compare the sentiment conveyed in each of the two families of language and culture (Arabic and European). The results show that the pairwise correlation between the two lexicons is consistent for emoji that represent, for instance, hearts, facial expressions, and body language. However, for a subset of emoji (those that represent objects, nature, symbols, and some human activities), there are large differences in the sentiment conveyed. More interestingly, an extremely high level of inconsistency has been shown with food emoji.

Keywords: european emoji sentiment, emoji sentiment lexicon, arabic emoji sentiment

Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in One Unified Format

Association for Computational Linguistics, 2021 (article)

With the rise of research on toxic comment classification, more and more annotated datasets have been released. The wide variety of the task (different languages, different labeling processes and schemes) has led to a large amount of heterogeneous datasets that can be used for training and testing very specific settings. Despite recent efforts to create web pages that provide an overview, most publications still use only a single dataset. They are not stored in one central database, they come in many different data formats and it is difficult to interpret their class labels and how to reuse these labels in other projects. To overcome these issues, we present a collection of more than thirty datasets in the form of a software tool that automatizes downloading and processing of the data and presents them in a unified data format that also offers a mapping of compatible class labels. Another advantage of that tool is that it gives an overview of properties of available datasets, such as different languages, platforms, and class labels to make it easier to select suitable training and test data.

Keywords: toxic comment classification, datasets easily accessible, easily accessible

Understanding Model Robustness to User-generated Noisy Texts

Association for Computational Linguistics, 2021 (article)

Sensitivity of deep-neural models to input noise is known to be a challenging problem. In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors. To mitigate this issue, models may leverage artificially noised data. However, the amount and type of generated noise has so far been determined arbitrarily. We therefore propose to model the errors statistically from grammatical-error-correction corpora. We present a thorough evaluation of several state-of-the-art NLP systems' robustness in multiple languages, with tasks including morpho-syntactic analysis, named entity recognition, neural machine translation, a subset of the GLUE benchmark and reading comprehension. We also compare two approaches to address the performance drop: a) training the NLP models with noised data generated by our framework; and b) reducing the input noise with external system for natural language correction. The code is released at https://github.com/ufal/kazitext.
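As a rough illustration of modeling errors statistically rather than choosing noise arbitrarily, the sketch below estimates character substitution frequencies from (corrected, erroneous) text pairs and then samples noise from them. This is a simplified assumption for illustration: it handles same-length substitutions only, the function names are hypothetical, and it does not reflect the API or the error model of the released kazitext code.

```python
import random
from collections import Counter, defaultdict

def estimate_substitutions(pairs):
    """Count, for each clean character, which characters were observed in its place.

    pairs: iterable of (corrected, erroneous) strings. Pairs of unequal length
    are skipped because this sketch models substitutions only.
    """
    counts = defaultdict(Counter)
    for clean, noisy in pairs:
        if len(clean) != len(noisy):
            continue
        for c, n in zip(clean, noisy):
            counts[c][n] += 1
    return counts

def add_noise(text, counts, rng):
    """Resample each character from its observed substitution distribution."""
    out = []
    for ch in text:
        observed = counts.get(ch)
        if not observed:
            out.append(ch)  # never seen in the corpus: leave unchanged
            continue
        chars = list(observed.keys())
        weights = list(observed.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)

# Usage with a toy "corpus": 'a' was written as 'e' in half of the observations.
toy_pairs = [("cat", "cat"), ("cat", "cet")]
model = estimate_substitutions(toy_pairs)
print(add_noise("a cat sat", model, random.Random(0)))
```

Because identity substitutions are counted too, characters that are rarely mistyped in the corpus are rarely changed, so the noise rate follows the data instead of a hand-picked constant.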

Keywords: user-generated noisy texts, noisy texts, user-generated noisy

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

Association for Computational Linguistics, 2021 (article)

Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, which are usually hard to find counterparts in the text descriptions; (2) removing the need of nested subqueries and set operators; and (3) making the schema linking easier by reducing the required number of schema items. On Spider, a challenging text-to-SQL benchmark that contains complex and nested SQL queries, we demonstrate that NatSQL outperforms other IRs, and significantly improves the performance of several previous SOTA models. Furthermore, for existing models that do not support executable SQL generation, NatSQL easily enables them to generate executable SQL queries, and achieves the new state-of-the-art execution accuracy.

Keywords: natural language specifications, language specifications, making sql easier

Searching for More Efficient Dynamic Programs

Association for Computational Linguistics, 2021 (article)

Computational models of human language often involve combinatorial problems. For instance, a probabilistic parser may marginalize over exponentially many trees to make predictions. Algorithms for such problems often employ dynamic programming and are not always unique. Finding one with optimal asymptotic runtime can be unintuitive, time-consuming, and error-prone. Our work aims to automate this laborious process. Given an initial correct declarative program, we search for a sequence of semantics-preserving transformations to improve its running time as much as possible. To this end, we describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a heuristic search procedure to improve this metric. We show that in practice, automated search, like the mental search performed by human programmers, can find substantial improvements to the initial program. Empirically, we show that many speed-ups described in the NLP literature could have been discovered automatically by our system.

Keywords: efficient dynamic programs, efficient dynamic, efficient
