Squared English Word: A Method of Generating Glyph to Use Super Characters for Sentiment Analysis

196 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Baohua Sun

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Baohua Sun - Lin Yang - Catherine Chi

الحساب واللغة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The Super Characters method addresses sentiment analysis problems by first converting the input text into images and then applying 2D-CNN models to classify the sentiment. It achieves state of the art performance on many benchmark datasets. However, it is not as straightforward to apply in Latin languages as in Asian languages. Because the 2D-CNN model is designed to recognize two-dimensional images, it is better if the inputs are in the form of glyphs. In this paper, we propose SEW (Squared English Word) method generating a squared glyph for each English word by drawing Super Characters images of each English word at the alphabet level, combining the squared glyph together into a whole Super Characters image at the sentence level, and then applying the CNN model to classify the sentiment within the sentence. We applied the SEW method to Wikipedia dataset and obtained a 2.1% accuracy gain compared to the original Super Characters method. For multi-modal data with both structured tabular data and unstructured natural language text, the modified SEW method integrates the data into a single image and classifies sentiment with one unified CNN model.

قيم البحث

135 - Baohua Sun , Lin Yang , Hao Sha 2020

Recent years NLP research has witnessed the record-breaking accuracy improvement by DNN models. However, power consumption is one of the practical concerns for deploying NLP systems. Most of the current state-of-the-art algorithms are implemented on GPUs, which is not power-efficient and the deployment cost is also very high. On the other hand, CNN Domain Specific Accelerator (CNN-DSA) has been in mass production providing low-power and low cost computation power. In this paper, we will implement the Super Characters method on the CNN-DSA. In addition, we modify the Super Characters method to utilize the multi-modal data, i.e. text plus tabular data in the CL-Aff sharedtask.

الحساب واللغة

Glyph-aware Embedding of Chinese Characters

74 - Falcon Z. Dai , Zheng Cai 2017

Given the advantage and recent success of English character-level and subword-unit models in several NLP tasks, we consider the equivalent modeling problem for Chinese. Chinese script is logographic and many Chinese logograms are composed of common s ubstructures that provide semantic, phonetic and syntactic hints. In this work, we propose to explicitly incorporate the visual appearance of a characters glyph in its representation, resulting in a novel glyph-aware embedding of Chinese characters. Being inspired by the success of convolutional neural networks in computer vision, we use them to incorporate the spatio-structural patterns of Chinese glyphs as rendered in raw pixels. In the context of two basic Chinese NLP tasks of language modeling and word segmentation, the model learns to represent each characters task-relevant semantic and syntactic information in the character-level embedding.

الحساب واللغة التعلم الآلي

Making the Best Use of Review Summary for Sentiment Analysis

92 - Sen Yang , Leyang Cui , Jun Xie 2019

Sentiment analysis provides a useful overview of customer review contents. Many review websites allow a user to enter a summary in addition to a full review. Intuitively, summary information may give additional benefit for review sentiment analysis. In this paper, we conduct a study to exploit methods for better use of summary information. We start by finding out that the sentimental signal distribution of a review and that of its corresponding summary are in fact complementary to each other. We thus explore various architectures to better guide the interactions between the two and propose a hierarchically-refined review-centric attention model. Empirical results show that our review-centric model can make better use of user-written summaries for review sentiment analysis, and is also more effective compared to existing methods when the user summary is replaced with summary generated by an automatic summarization system.

الحساب واللغة

On the Effect of Word Order on Cross-lingual Sentiment Analysis

167 - `Alex R. Atrio , Toni Badia , Jeremy Barnes 2019

Current state-of-the-art models for sentiment analysis make use of word order either explicitly by pre-training on a language modeling objective or implicitly by using recurrent neural networks (RNNs) or convolutional networks (CNNs). This is a probl em for cross-lingual models that use bilingual embeddings as features, as the difference in word order between source and target languages is not resolved. In this work, we explore reordering as a pre-processing step for sentence-level cross-lingual sentiment classification with two language combinations (English-Spanish, English-Catalan). We find that while reordering helps both models, CNNS are more sensitive to local reorderings, while global reordering benefits RNNs.

الحساب واللغة

SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment

213 - Jisun An , Haewoon Kwak , Yong-Yeol Ahn 2018

Because word semantics can substantially change across communities and contexts, capturing domain-specific word semantics is an important challenge. Here, we propose SEMAXIS, a simple yet powerful framework to characterize word semantics using many s emantic axes in word- vector spaces beyond sentiment. We demonstrate that SEMAXIS can capture nuanced semantic representations in multiple online communities. We also show that, when the sentiment axis is examined, SEMAXIS outperforms the state-of-the-art approaches in building domain-specific sentiment lexicons.

الحساب واللغة أجهزة الكمبيوتر والمجتمع

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة حلوان

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Squared English Word: A Method of Generating Glyph to Use Super Characters for Sentiment Analysis

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً