Learning Word Representations with Hierarchical Sparse Coding

307 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Dani Yogatama

تاريخ النشر 2014

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Dani Yogatama - Manaal Faruqui - Chris Dyer

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks---word similarity ranking, analogies, sentence completion, and sentiment analysis---demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at url{http://www.ark.cs.cmu.edu/dyogatam/wordvecs/}.

قيم البحث

106 - Yunchuan Chen , Lili Mou , Yan Xu 2016

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding context words by hidden layers, an output layer estimates the probability of the next word. Such approaches are time- and memory-intensive because of the large numbers of parameters for word embeddings and the output layer. In this paper, we propose to compress neural language models by sparse word representations. In the experiments, the number of parameters in our model increases very slowly with the growth of the vocabulary size, which is almost imperceptible. Moreover, our approach not only reduces the parameter space to a large extent, but also improves the performance in terms of the perplexity measure.

الحساب واللغة التعلم الآلي

Sparse Lifting of Dense Vectors: Unifying Word and Sentence Representations

110 - Wenye Li , Senyue Hao 2019

As the first step in automated natural language processing, representing words and sentences is of central importance and has attracted significant research attention. Different approaches, from the early one-hot and bag-of-words representation to mo re recent distributional dense and sparse representations, were proposed. Despite the successful results that have been achieved, such vectors tend to consist of uninterpretable components and face nontrivial challenge in both memory and computational requirement in practical applications. In this paper, we designed a novel representation model that projects dense word vectors into a higher dimensional space and favors a highly sparse and binary representation of word vectors with potentially interpretable components, while trying to maintain pairwise inner products between original vectors as much as possible. Computationally, our model is relaxed as a symmetric non-negative matrix factorization problem which admits a fast yet effective solution. In a series of empirical evaluations, the proposed model exhibited consistent improvement and high potential in practical applications.

الحساب واللغة التعلم الآلي

Learning Gender-Neutral Word Embeddings

104 - Jieyu Zhao , Yichao Zhou , Zeyu Li 2018

Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that refle ct social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.

الحساب واللغة التعلم الآلي التعلم الالي

Learning Dynamic Author Representations with Temporal Language Models

98 - Edouard Delasalles , Sylvain Lamprier , Ludovic Denoyer 2019

Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variab les that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors identities, as well as publication dates, are seldom considered. We propose a neural model, based on recurrent language modeling, which aims at capturing language diffusion tendencies in author communities through time. By conditioning language models with author and temporal vector states, we are able to leverage the latent dependencies between the text contexts. This allows us to beat several temporal and non-temporal language baselines on two real-world corpora, and to learn meaningful author representations that vary through time.

الحساب واللغة التعلم الآلي التعلم الالي

Composable Learning with Sparse Kernel Representations

161 - Ekaterina Tolstaya , Ethan Stump , Alec Koppel 2021

We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space. We improve the sample complexity of this approach by imposing a structure of the state-action function through a norma lized advantage function (NAF). This representation of the policy enables efficiently composing multiple learned models without additional training samples or interaction with the environment. We demonstrate the performance of this algorithm on learning obstacle-avoidance policies in multiple simulations of a robot equipped with a laser scanner while navigating in a 2D environment. We apply the composition operation to various policy combinations and test them to show that the composed policies retain the performance of their components. We also transfer the composed policy directly to a physical platform operating in an arena with obstacles in order to demonstrate a degree of generalization.

علم الروبوتات التعلم الآلي