Comparing Text Representations: A Theory-Driven Approach

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Language: English

Created by Shamra Editor

Keywords: theory-driven approach, comparing text representations, comparing text

Abstract

Much of the progress in contemporary NLP has come from learning representations, such as masked language model (MLM) contextual embeddings, that turn challenging problems into simple classification tasks. But how do we quantify and explain this effect? We adapt general tools from computational learning theory to fit the specific characteristics of text datasets and present a method to evaluate the compatibility between representations and tasks. Even though many tasks can be easily solved with simple bag-of-words (BOW) representations, BOW does poorly on hard natural language inference tasks. For one such task we find that BOW cannot distinguish between real and randomized labelings, while pre-trained MLM representations show 72x greater distinction between real and random labelings than BOW. This method provides a calibrated, quantitative measure of the difficulty of a classification-based NLP task, enabling comparisons between representations without requiring empirical evaluations that may be sensitive to initializations and hyperparameters. The method provides a fresh perspective on the patterns in a dataset and the alignment of those patterns with specific labels.
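The comparison between real and randomized labelings described in the abstract can be illustrated with a small sketch. This is not the paper's measure: it swaps in kernel-target alignment with a linear kernel as a simple stand-in for "compatibility between a representation and a labeling". The function names, the two toy representations, and the number of shuffles are all assumptions made for illustration.

```python
import math
import random


def alignment(vectors, labels):
    """Kernel-target alignment for a linear kernel K = X X^T and labels in {-1, +1}."""
    n = len(vectors)
    # Linear kernel: pairwise dot products between item vectors.
    kernel = [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]
    # y^T K y: how strongly same-label items are similar under the representation.
    fit = sum(labels[i] * labels[j] * kernel[i][j] for i in range(n) for j in range(n))
    kernel_norm = math.sqrt(sum(k * k for row in kernel for k in row))
    return fit / (n * kernel_norm)  # the Frobenius norm of y y^T is n for +/-1 labels


def real_vs_random(vectors, labels, trials=200, seed=0):
    """Return (alignment with the real labels, mean alignment over shuffled labels)."""
    rng = random.Random(seed)
    real = alignment(vectors, labels)
    shuffled_scores = []
    for _ in range(trials):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # randomized labeling with the same class balance
        shuffled_scores.append(alignment(vectors, shuffled))
    return real, sum(shuffled_scores) / trials


# Hypothetical toy data: the same six items under two representations.
labels = [1, 1, 1, -1, -1, -1]
informative = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 1.0], [0.0, 0.9]]
uninformative = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0], [0.0, 0.9], [0.9, 0.0]]

for name, vectors in [("informative", informative), ("uninformative", uninformative)]:
    real, shuffled = real_vs_random(vectors, labels)
    print(f"{name}: real={real:.3f} shuffled mean={shuffled:.3f}")
```

In this toy setup, a representation that suits the task should score clearly higher on the real labels than on shuffled ones, while a poorly suited representation should show little or no gap. The size of that gap is the kind of quantity the abstract compares between BOW and MLM representations.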

References used

https://aclanthology.org/

A Text Editing Approach to Joint Japanese Word Segmentation, POS Tagging, and Lexical Normalization

1044 - Association for Computational Linguistics 2021 (article)

Lexical normalization, in addition to word segmentation and part-of-speech tagging, is a fundamental task for Japanese user-generated text processing. In this paper, we propose a text editing model to solve the three tasks jointly and methods of pseudo-labeled data generation to overcome the problem of data deficiency. Our experiments showed that the proposed model achieved better normalization performance when trained on more diverse pseudo-labeled data.

Keywords: joint japanese word, approach to joint, text editing approach

Textual Time Travel: A Temporally Informed Approach to Theory of Mind

663 - Association for Computational Linguistics 2021 (article)

Natural language processing systems such as dialogue agents should be able to reason about other people's beliefs, intentions and desires. This capability, called theory of mind (ToM), is crucial, as it allows a model to predict and interpret the needs of users based on their mental states. A recent line of research evaluates the ToM capability of existing memory-augmented neural models through question-answering. These models perform poorly on false belief tasks where beliefs differ from reality, especially when the dataset contains distracting sentences. In this paper, we propose a new temporally informed approach for improving the ToM capability of memory-augmented neural models. Our model incorporates priors about the entities' minds and tracks their mental states as they evolve over time through an extended passage. It then responds to queries through textual time travel, i.e., by accessing the stored memory of an earlier time step. We evaluate our model on ToM datasets and find that this approach improves performance, particularly by correcting the predicted mental states to match the false belief.

Keywords: temporally informed approach, temporally informed, textual time travel

Disentangling Representations of Text by Masking Transformers

691 - Association for Computational Linguistics 2021 (article)

Representations from large pretrained models such as BERT encode a range of features into monolithic vectors, affording strong predictive accuracy across a range of downstream tasks. In this paper we explore whether it is possible to learn disentangled representations by identifying existing subnetworks within pretrained models that encode distinct, complementary aspects. Concretely, we learn binary masks over transformer weights or hidden units to uncover subsets of features that correlate with a specific factor of variation; this eliminates the need to train a disentangled model from scratch for a particular task. We evaluate this method with respect to its ability to disentangle representations of sentiment from genre in movie reviews, toxicity from dialect in Tweets, and syntax from semantics. By combining masking with magnitude pruning we find that we can identify sparse subnetworks within BERT that strongly encode particular aspects (e.g., semantics) while only weakly encoding others (e.g., syntax). Moreover, despite only learning masks, disentanglement-via-masking performs as well as, and often better than, previously proposed methods based on variational autoencoders and adversarial training.

Keywords: disentangling representations, representations of text, text by masking

Text Style Transfer: Leveraging a Style Classifier on Entangled Latent Representations

874 - Association for Computational Linguistics 2021 (article)

Learning a good latent representation is essential for text style transfer, which generates a new sentence by changing the attributes of a given sentence while preserving its content. Most previous works adopt disentangled latent representation learning to realize style transfer. We propose a novel text style transfer algorithm with entangled latent representation, and introduce a style classifier that can regulate the latent structure and transfer style. Moreover, our algorithm for style transfer applies to both single-attribute and multi-attribute transfer. Extensive experimental results show that our method generally outperforms state-of-the-art approaches.

Keywords: text style transfer, style transfer, latent representation

SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation

783 - Association for Computational Linguistics 2021 (article)

We motivate and propose a suite of simple but effective improvements for concept-to-text generation called SAPPHIRE: Set Augmentation and Post-hoc PHrase Infilling and REcombination. We demonstrate their effectiveness on generative commonsense reasoning, a.k.a. the CommonGen task, through experiments using both BART and T5 models. Through extensive automatic and human evaluation, we show that SAPPHIRE noticeably improves model performance. An in-depth qualitative analysis illustrates that SAPPHIRE effectively addresses many issues of the baseline model generations, including lack of commonsense, insufficient specificity, and poor fluency.

Keywords: approaches for enhanced, post-hoc phrase infilling, generation called sapphire
