
Bayesian Optimization of Text Representations


Added by: Dani Yogatama

Publication date: 2015

Field: Informatics Engineering

Language: English

Authors: Dani Yogatama, Noah A. Smith

Computation and Language Machine Learning


Abstract

When applying machine learning to problems in NLP, there are many choices to make about how to represent input texts. These choices can have a big effect on performance, but they are often uninteresting to researchers or practitioners who simply need a module that performs well. We propose an approach to optimizing over this space of choices, formulating the problem as global optimization. We apply a sequential model-based optimization technique and show that our method makes standard linear models competitive with more sophisticated, expensive state-of-the-art methods based on latent variable models or neural networks on various topic classification and sentiment analysis problems. Our approach is a first step towards black-box NLP systems that work with raw text and do not require manual tuning.
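The abstract describes the general shape of the approach (treat representation choices as a search space and explore it with sequential model-based optimization) without giving details. The sketch below illustrates that loop only. Everything in it is an assumption made for illustration: the search space `SPACE`, the `toy_objective` that stands in for "train a linear model and measure validation accuracy", and the deliberately simple additive surrogate in `acquisition`. It is not the surrogate model or the search space used in the paper.

```python
import itertools
import math
import random

# Hypothetical search space of text-representation choices (illustrative only).
SPACE = {
    "max_ngram": [1, 2, 3],
    "weighting": ["binary", "tf", "tfidf"],
    "remove_stopwords": [False, True],
}


def toy_objective(config):
    """Stand-in for training a linear classifier and returning validation accuracy."""
    score = 0.70
    score += {1: 0.00, 2: 0.05, 3: 0.03}[config["max_ngram"]]
    score += {"binary": 0.02, "tf": 0.00, "tfidf": 0.04}[config["weighting"]]
    score += 0.01 if config["remove_stopwords"] else 0.0
    return score


def all_configs(space):
    """Enumerate every combination of choices in the (small, discrete) space."""
    keys = sorted(space)
    return [dict(zip(keys, values))
            for values in itertools.product(*(space[k] for k in keys))]


def acquisition(config, history):
    """Simplistic additive surrogate plus an exploration bonus.

    Each chosen value is scored by the mean of past results that used it;
    values tried less often receive a larger bonus.
    """
    overall = sum(score for _, score in history) / len(history)
    total = 0.0
    for key, value in config.items():
        seen = [score for past, score in history if past[key] == value]
        mean = sum(seen) / len(seen) if seen else overall
        total += mean + 0.05 / math.sqrt(1 + len(seen))
    return total


def smbo(objective, space, n_init=3, n_iter=8, seed=0):
    """Evaluate a few random configs, then repeatedly pick the most promising one."""
    rng = random.Random(seed)
    remaining = all_configs(space)
    rng.shuffle(remaining)
    history = [(c, objective(c)) for c in remaining[:n_init]]
    remaining = remaining[n_init:]
    for _ in range(min(n_iter, len(remaining))):
        chosen = max(remaining, key=lambda c: acquisition(c, history))
        remaining.remove(chosen)
        history.append((chosen, objective(chosen)))
    best_config, best_score = max(history, key=lambda pair: pair[1])
    return best_config, best_score, history


if __name__ == "__main__":
    best_config, best_score, history = smbo(toy_objective, SPACE)
    print(len(history), "evaluations; best:", best_config, round(best_score, 3))
```

In a real setting the objective would be the expensive step (featurizing the corpus and fitting a model), which is why the loop spends effort choosing each next configuration instead of evaluating all of them.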


Disentangling Representations of Text by Masking Transformers

Xiongyi Zhang, Jan-Willem van de Meent, Byron C. Wallace (2021)

Representations from large pretrained models such as BERT encode a range of features into monolithic vectors, affording strong predictive accuracy across a multitude of downstream tasks. In this paper we explore whether it is possible to learn disentangled representations by identifying existing subnetworks within pretrained models that encode distinct, complementary aspect representations. Concretely, we learn binary masks over transformer weights or hidden units to uncover subsets of features that correlate with a specific factor of variation; this eliminates the need to train a disentangled model from scratch for a particular task. We evaluate this method with respect to its ability to disentangle representations of sentiment from genre in movie reviews, toxicity from dialect in Tweets, and syntax from semantics. By combining masking with magnitude pruning we find that we can identify sparse subnetworks within BERT that strongly encode particular aspects (e.g., toxicity) while only weakly encoding others (e.g., race). Moreover, despite only learning masks, we find that disentanglement-via-masking performs as well as, and often better than, previously proposed methods based on variational autoencoders and adversarial training.

Computation and Language Machine Learning

Kernelized Bayesian Softmax for Text Generation

Ning Miao, Hao Zhou, Chengqi Zhao (2019)

Neural models for text generation require a softmax layer with proper token embeddings during the decoding phase. Most existing approaches adopt a single point embedding for each token. However, a word may have multiple senses depending on context, some of which may be quite distinct. In this paper, we propose KerBS, a novel approach for learning better embeddings for text generation. KerBS has two advantages: (a) it employs a Bayesian composition of embeddings for words with multiple senses; (b) it adapts to the semantic variance of words and is robust to rare sentence contexts by imposing learned kernels that capture the closeness of words (senses) in the embedding space. Empirical studies show that KerBS significantly boosts performance on several text generation tasks.

Computation and Language Machine Learning

Comparing Text Representations: A Theory-Driven Approach

Gregory Yauney, David Mimno (2021)

Much of the progress in contemporary NLP has come from learning representations, such as masked language model (MLM) contextual embeddings, that turn challenging problems into simple classification tasks. But how do we quantify and explain this effect? We adapt general tools from computational learning theory to fit the specific characteristics of text datasets and present a method to evaluate the compatibility between representations and tasks. Even though many tasks can be easily solved with simple bag-of-words (BOW) representations, BOW does poorly on hard natural language inference tasks. For one such task we find that BOW cannot distinguish between real and randomized labelings, while pre-trained MLM representations show 72x greater distinction between real and random labelings than BOW. This method provides a calibrated, quantitative measure of the difficulty of a classification-based NLP task, enabling comparisons between representations without requiring empirical evaluations that may be sensitive to initializations and hyperparameters. The method provides a fresh perspective on the patterns in a dataset and the alignment of those patterns with specific labels.

Computation and Language Machine Learning

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao (2019)

The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolation or extrapolation in the continuous latent space. However, on discrete data such as text, it is unclear whether unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer, and is capable of performing more flexible fine-grained control over text generation than existing methods.

Computation and Language Machine Learning

On-Device Text Representations Robust To Misspellings via Projections

Chinnadhurai Sankar, Sujith Ravi, Zornitsa Kozareva (2019)

Recently, there has been strong interest in developing natural language applications that live on personal devices such as mobile phones, watches, and IoT devices, with the goal of preserving user privacy and keeping memory use low. Advances in Locality-Sensitive Hashing (LSH)-based projection networks have demonstrated state-of-the-art performance in various classification tasks without explicit word (or word-piece) embedding lookup tables by computing on-the-fly text representations. In this paper, we show that projection-based neural classifiers are inherently robust to misspellings and perturbations of the input text. We empirically demonstrate that LSH projection-based classifiers are more robust to common misspellings than BiLSTMs (with both word-piece and word-only tokenization) and fine-tuned BERT-based methods. When subjected to misspelling attacks, LSH projection-based classifiers had a small average accuracy drop of 2.94% across multiple classification tasks, while the accuracy of the fine-tuned BERT model dropped significantly, by 11.44%.

Computation and Language Machine Learning
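To make the idea of an on-the-fly, lookup-free text representation in the abstract above more concrete, here is a minimal sketch of a SimHash-style sign projection over character trigrams. It is an assumption-laden illustration, not the projection network from that paper: the trigram features, the use of SHA-256 as the hash, the 256-bit width, and the helper names `char_trigrams`, `project`, and `agreement` are all choices made here. It shows why a small misspelling tends to change only a minority of the output bits: most trigrams, and therefore most of the per-bit votes, stay the same.

```python
import hashlib

N_BITS = 256  # SHA-256 yields exactly 256 bits per feature


def char_trigrams(text):
    """Character trigrams of the lowercased text, with simple space padding."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def project(text):
    """SimHash-style projection: each trigram votes +1 or -1 on every bit."""
    totals = [0] * N_BITS
    for gram in char_trigrams(text):
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        for bit in range(N_BITS):
            byte, offset = divmod(bit, 8)
            totals[bit] += 1 if (digest[byte] >> offset) & 1 else -1
    # The sign of each vote total gives the final bit (ties count as 1).
    return [1 if total >= 0 else 0 for total in totals]


def agreement(a, b):
    """Fraction of projection bits on which two texts agree."""
    bits_a, bits_b = project(a), project(b)
    return sum(x == y for x, y in zip(bits_a, bits_b)) / N_BITS


if __name__ == "__main__":
    print(agreement("representation", "representaton"))  # misspelling
    print(agreement("representation", "classifier"))     # unrelated word
```

No embedding table is stored: the bit vector is recomputed from the raw characters each time, which is the property that makes this family of methods attractive when memory is limited.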
