
TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: semantic shifts, shifts between corpora


Abstract

Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings. TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface. We further propose a new measure of embedding confidence based on nearest neighborhood overlap, to assist in identifying high-quality embeddings for corpus analysis. A case study on COVID-19 scientific literature illustrates the utility of the system. TextEssence can be found at https://textessence.github.io.
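The abstract names a confidence measure based on nearest neighborhood overlap but does not define it. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes several replicate embedding matrices trained on the same corpus with the same row order, and scores each item by how much its top-k cosine neighbors agree across replicate pairs. The function names, the choice of k, and the pairwise averaging are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def top_k_neighbors(emb, k):
    """For each row of emb, return the set of indices of its k nearest rows (cosine)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # an item is never its own neighbor
    idx = np.argsort(-sims, axis=1)[:, :k]
    return [set(row.tolist()) for row in idx]


def neighborhood_overlap_confidence(replicates, k=5):
    """Hypothetical confidence score: mean fraction of shared top-k neighbors
    over all pairs of replicate embedding matrices (assumed definition)."""
    neighbor_sets = [top_k_neighbors(e, k) for e in replicates]
    n_items = replicates[0].shape[0]
    pairs = list(combinations(range(len(replicates)), 2))
    scores = np.zeros(n_items)
    for a, b in pairs:
        for i in range(n_items):
            scores[i] += len(neighbor_sets[a][i] & neighbor_sets[b][i]) / k
    return scores / len(pairs)


# Toy usage: three replicates of a 20-item vocabulary, one of them perturbed.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 8))
noisy = base + rng.normal(scale=0.5, size=base.shape)
confidence = neighborhood_overlap_confidence([base, base.copy(), noisy], k=5)
```

Under this reading, items whose neighborhoods stay stable across replicates score close to 1, and items whose neighbors change from run to run score close to 0.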

References used

https://aclanthology.org/


Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings

835 - Association for Computational Linguistics, 2021, article

Detecting lexical semantic change in smaller data sets, e.g. in historical linguistics and digital humanities, is challenging due to a lack of statistical power. This issue is exacerbated by non-contextual embedding models that produce one embedding per word and, therefore, mask the variability present in the data. In this article, we propose an approach to estimate semantic shift by combining contextual word embeddings with permutation-based statistical tests. We use the false discovery rate procedure to address the large number of hypothesis tests being conducted simultaneously. We demonstrate the performance of this approach in simulation where it achieves consistently high precision by suppressing false positives. We additionally analyze real-world data from SemEval-2020 Task 1 and the Liverpool FC subreddit corpus. We show that by taking sample variation into account, we can improve the robustness of individual semantic shift estimates without degrading overall performance.
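The combination described in this abstract, a permutation test per word followed by a false discovery rate correction, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the test statistic (cosine distance between mean contextual embeddings), the number of permutations, and the use of the Benjamini-Hochberg procedure are all choices made here, and the function names are hypothetical.

```python
import numpy as np


def cosine_distance(u, v):
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def permutation_p_value(emb_a, emb_b, n_perm=999, rng=None):
    """P-value for a shift between two sets of usage embeddings of one word.
    Assumed statistic: cosine distance between the two group means."""
    rng = np.random.default_rng() if rng is None else rng
    observed = cosine_distance(emb_a.mean(axis=0), emb_b.mean(axis=0))
    pooled = np.vstack([emb_a, emb_b])
    n_a = len(emb_a)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))  # shuffle corpus labels
        stat = cosine_distance(pooled[perm[:n_a]].mean(axis=0),
                               pooled[perm[n_a:]].mean(axis=0))
        exceed += int(stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of hypotheses rejected at FDR level alpha (BH step-up)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = np.max(np.nonzero(below)[0])  # largest rank passing its threshold
        reject[order[:cutoff + 1]] = True
    return reject
```

In use, one p-value would be computed per target word and the full list passed to `benjamini_hochberg`, so that only words surviving the correction are reported as shifted.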

Keywords: statistically significant detection

Interactive Learning Approach for Arabic Target-Based Sentiment Analysis

822 - Association for Computational Linguistics, 2021, article

Recently, the majority of sentiment analysis researchers focus on target-based sentiment analysis because it delivers in-depth analysis with more accurate results as compared to traditional sentiment analysis. In this paper, we propose an interactive learning approach to tackle a target-based sentiment analysis task for the Arabic language. The proposed IA-LSTM model uses an interactive attention-based mechanism to force the model to focus on different parts (targets) of a sentence. We investigate the ability to use targets, right, and left context, and model them separately to learn their own representations via interactive modeling. We evaluated our model on two different datasets: Arabic hotel review and Arabic book review datasets. The results demonstrate the effectiveness of using this interactive modeling technique for the Arabic target-based task. The model obtained accuracy values of 83.10 compared to SOTA models such as AB-LSTM-PC which obtained 82.60 for the same dataset.

Keywords: target-based sentiment analysis, interactive learning approach

CombAlign: a Tool for Obtaining High-Quality Word Alignments

1178 - Association for Computational Linguistics, 2021, article

Being able to generate accurate word alignments is useful for a variety of tasks. While statistical word aligners can work well, especially when parallel training data are plentiful, multilingual embedding models have recently been shown to give good results in unsupervised scenarios. We evaluate an ensemble method for word alignment on four language pairs and demonstrate that by combining multiple tools, taking advantage of their different approaches, substantial gains can be made. This holds for settings ranging from very low-resource to high-resource. Furthermore, we introduce a new gold alignment test set for Icelandic and a new easy-to-use tool for creating manual word alignments.

Keywords: high-quality word alignments

A Universal Dependencies Corpora Maintenance Methodology Using Downstream Application

661 - Association for Computational Linguistics, 2021, article

This paper investigates updates of Universal Dependencies (UD) treebanks in 23 languages and their impact on a downstream application. Numerous people are involved in updating UD's annotation guidelines and treebanks in various languages. However, it is not easy to verify whether the updated resources maintain universality with other language resources. Thus, validity and consistency of multilingual corpora should be tested through application tasks involving syntactic structures with PoS tags, dependency labels, and universal features. We apply the syntactic parsers trained on UD treebanks from multiple versions (2.0 to 2.7) to a clause-level sentiment extractor. We then analyze the relationships between attachment scores of dependency parsers and performance in application tasks. For future UD developments, we show examples of outputs that differ depending on version.

Keywords: corpora maintenance methodology, dependencies corpora maintenance

FrenLyS: A Tool for the Automatic Simplification of French General Language Texts

662 - Association for Computational Linguistics, 2021, article

Lexical simplification (LS) aims at replacing words considered complex in a sentence by simpler equivalents. In this paper, we present the first automatic LS service for French, FrenLyS, which offers different techniques to generate, select and rank substitutes. The paper describes the different methods proposed by our tool, which includes both classical approaches (e.g. generation of candidates from lexical resources, frequency filter, etc.) and more innovative approaches such as the exploitation of CamemBERT, a model for French based on the RoBERTa architecture. To evaluate the different methods, a new evaluation dataset for French is introduced.

Keywords: general language texts, French general language

