A Study on Using Semantic Word Associations to Predict the Success of a Novel

Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Language: English

Created by Shamra Editor

Keywords: semantic word associations, book success prediction, book success

Many new books are published every year, and only a fraction of them become popular among readers. Predicting a book's success can therefore be a very useful signal for publishers making decisions. This article presents a study of semantic word associations, using the word embedding of book content for a set of Roget's Thesaurus concepts, for book success prediction. In this work, we discuss a method to represent a book as a spectrum of concepts based on the association score between its content embedding and a global embedding (i.e. fastText) for a set of semantically linked word clusters. We show that semantic word associations outperform previous methods for book success prediction. In addition, we show that semantic word associations also provide better results than features such as the frequency of word groups in Roget's Thesaurus, LIWC (a popular tool for linguistic inquiry and word count), NRC (word association emotion lexicon), and part of speech (PoS). Our study reports that concept associations based on Roget's Thesaurus, using word embeddings of individual novels, resulted in state-of-the-art performance of 0.89 average weighted F1-score for book success prediction. Finally, we present a set of dominant themes that contribute towards the popularity of a book for a specific genre.
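The abstract does not spell out how the association score is computed, so the following is only a minimal sketch of one possible reading: for each concept (a cluster of related words), measure how tightly the cluster's words sit together in an embedding trained on the book, and compare that with the same measure in a global embedding. The function names, the toy two-dimensional vectors, and the "book cohesion minus global cohesion" score are all assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_cohesion(embedding, words):
    """Mean pairwise cosine similarity among the cluster words found in `embedding`."""
    present = [w for w in words if w in embedding]
    pairs = [(a, b) for i, a in enumerate(present) for b in present[i + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine(embedding[a], embedding[b]) for a, b in pairs) / len(pairs)


def concept_spectrum(book_emb, global_emb, concepts):
    """Hypothetical association score per concept: book cohesion minus global cohesion."""
    return {
        name: cluster_cohesion(book_emb, words) - cluster_cohesion(global_emb, words)
        for name, words in concepts.items()
    }


# Toy data (invented): two concepts, each with two words, in 2-D embeddings.
concepts = {"affection": ["love", "heart"], "conflict": ["war", "fight"]}
global_emb = {"love": [1.0, 0.0], "heart": [0.0, 1.0],
              "war": [1.0, 0.0], "fight": [1.0, 0.0]}
book_emb = {"love": [1.0, 0.0], "heart": [1.0, 0.0],
            "war": [1.0, 0.0], "fight": [0.0, 1.0]}

spectrum = concept_spectrum(book_emb, global_emb, concepts)
print(spectrum)
```

In this toy setup the book uses the "affection" words more cohesively than the global embedding does, and the "conflict" words less cohesively, so the two concepts receive scores of opposite sign. A vector of such scores over all concepts could then be fed to any standard classifier.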

References used

https://aclanthology.org/

hub at SemEval-2021 Task 1: Fusion of Sentence and Word Frequency to Predict Lexical Complexity

Association for Computational Linguistics, 2021 (article)

In this paper, we propose a method of fusing sentence information and word frequency information for the SemEval-2021 Task 1 Lexical Complexity Prediction (LCP) shared task. In our system, the sentence information comes from the RoBERTa model, and the word frequency information comes from the Tf-Idf algorithm. An Inception block serves as a shared layer to learn from both the sentence and word frequency information. We describe the implementation of our best system and discuss our methods and experiments in the task. The shared task is divided into two subtasks, both of which aim to predict the complexity of a predetermined word. The evaluation metric of the task is the Pearson correlation coefficient. Our best-performing system achieves Pearson correlation coefficients of 0.7434 and 0.8000 on the single-token subtask test set and the multi-token subtask test set, respectively.
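For readers unfamiliar with the evaluation metric named above, here is a small sketch of the Pearson correlation coefficient between gold and predicted complexity scores. The function name and the sample scores are invented for illustration and are not taken from the shared task data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical gold and predicted complexity scores in [0, 1].
gold = [0.10, 0.30, 0.50, 0.70]
predicted = [0.20, 0.25, 0.55, 0.60]
r = pearson(gold, predicted)
print(round(r, 4))
```

A value near 1 means the predicted scores rise and fall with the gold scores; the metric ignores any constant offset or scaling between the two.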

Keywords: word frequency information, predict lexical complexity, word frequency

A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss

Association for Computational Linguistics, 2021 (article)

Neural models trained for next-utterance generation in dialogue tasks learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy. Such commonly used training objectives do not foster generating alternate responses to a context. However, the effects of minimizing an alternate training objective that encourages a model to generate an alternate response and scores it on semantic similarity have not been well studied. We hypothesize that a language generation model can improve its diversity by learning to generate alternate text during training and minimizing a semantic loss as an auxiliary objective. We explore this idea on two data sets of different sizes on the task of next-utterance generation in goal-oriented dialogues. We make two observations: (1) minimizing a semantic objective improved diversity in responses in the smaller data set (Frames) but was only as good as minimizing the NLL in the larger data set (MultiWoZ); (2) large language model embeddings can be more useful as a semantic loss objective than as initialization for token embeddings.

Keywords: training generative dialogue, generative dialogue models, generative dialogue

Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings

Association for Computational Linguistics, 2021 (article)

Detecting lexical semantic change in smaller data sets, e.g. in historical linguistics and digital humanities, is challenging due to a lack of statistical power. This issue is exacerbated by non-contextual embedding models that produce one embedding per word and, therefore, mask the variability present in the data. In this article, we propose an approach to estimate semantic shift by combining contextual word embeddings with permutation-based statistical tests. We use the false discovery rate procedure to address the large number of hypothesis tests being conducted simultaneously. We demonstrate the performance of this approach in simulation where it achieves consistently high precision by suppressing false positives. We additionally analyze real-world data from SemEval-2020 Task 1 and the Liverpool FC subreddit corpus. We show that by taking sample variation into account, we can improve the robustness of individual semantic shift estimates without degrading overall performance.
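The combination of a permutation test with a false discovery rate correction can be sketched as follows. The abstract does not specify the test statistic or the exact FDR procedure, so this sketch assumes the distance between the mean usage vectors of two periods as the statistic and the Benjamini-Hochberg procedure for the correction; the function names and toy vectors are hypothetical.

```python
import random
from math import sqrt


def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def shift(a, b):
    """Assumed test statistic: Euclidean distance between the two sample means."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(mean_vec(a), mean_vec(b))))


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Share of random relabelings whose shift is at least the observed one."""
    rng = random.Random(seed)
    observed = shift(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if shift(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected


# Toy contextual embeddings of one word in two periods (invented values).
period_1 = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
period_2 = [[1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1]]
p_shifted = permutation_p_value(period_1, period_2)
p_same = permutation_p_value(period_1, period_1)
print(p_shifted, p_same)
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.8]))
```

Running one permutation test per target word yields a list of p-values, and the FDR step then decides which words to report as shifted while limiting the expected share of false discoveries.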

Keywords: statistically significant detection, significant detection, statistically significant

Using Noisy Self-Reports to Predict Twitter User Demographics

Association for Computational Linguistics, 2021 (article)

Computational social science studies often contextualize content analysis within standard demographics. Since demographics are unavailable on many social media platforms (e.g. Twitter), numerous studies have inferred demographics automatically. Despite many studies presenting proof-of-concept inference of race and ethnicity, training of practical systems remains elusive since there are few annotated datasets. Existing datasets are small, inaccurate, or fail to cover the four most common racial and ethnic groups in the United States. We present a method to identify self-reports of race and ethnicity from Twitter profile descriptions. Despite the noise of automated supervision, our self-report datasets enable improvements in classification performance on gold standard self-report survey data. The result is a reproducible method for creating large-scale training resources for race and ethnicity.

Keywords: predict twitter user, twitter user demographics, predict twitter

A Novel Framework for Detecting Important Subevents from Crisis Events via Dynamic Semantic Graphs

Association for Computational Linguistics, 2021 (article)

Social media is an essential tool to share information about crisis events, such as natural disasters. Event Detection aims at extracting information in the form of an event, but considers each event in isolation, without combining information across sentences or events. Many posts in Crisis NLP contain repetitive or complementary information which needs to be aggregated (e.g., the number of trapped people and their location) for disaster response. Although previous approaches in Crisis NLP aggregate information across posts, they only use shallow representations of the content (e.g., keywords), which cannot adequately represent the semantics of a crisis event and its sub-events. In this work, we propose a novel framework to extract critical sub-events from a large-scale crisis event by combining important information across relevant tweets. Our framework first converts all the tweets from a crisis event into a temporally-ordered set of graphs. Then it extracts sub-graphs that represent semantic relationships connecting verbs and nouns in 3 to 6 node sub-graphs. It does this by learning edge weights via Dynamic Graph Convolutional Networks (DGCNs) and extracting smaller, relevant sub-graphs. Our experiments show that our extracted structures (1) are semantically meaningful sub-events and (2) contain information important for the large crisis-event. Furthermore, we show that our approach significantly outperforms event detection baselines, highlighting the importance of aggregating information across tweets for our task.

Keywords: detecting important subevents, detecting important, important subevents
