Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Analyzing machine-learned representations: A natural language case study

81 0 0.0 ( 0 )

Download Cite

Added by Ishita Dasgupta

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Ishita Dasgupta - Demi Guo - Samuel J. Gershman

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

As modern deep networks become more complex, and get closer to human-like capabilities in certain domains, the question arises of how the representations and decision rules they learn compare to the ones in humans. In this work, we study representations of sentences in one such artificial system for natural language processing. We first present a diagnostic test dataset to examine the degree of abstract composable structure represented. Analyzing performance on these diagnostic tests indicates a lack of systematicity in the representations and decision rules, and reveals a set of heuristic strategies. We then investigate the effect of the training distribution on learning these heuristic strategies, and study changes in these representations with various augmentations to the training set. Our results reveal parallels to the analogous representations in people. We find that these systems can learn abstract rules and generalize them to new contexts under certain circumstances -- similar to human zero-shot reasoning. However, we also note some shortcomings in this generalization behavior -- similar to human judgment errors like belief bias. Studying these parallels suggests new ways to understand psychological phenomena in humans as well as informs best strategies for building artificial intelligence with human-like language understanding.

rate research

Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

164 - Aditya Siddhant , Zachary C. Lipton 2018

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by Backprop significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.

Computation and Language Machine Learning Machine Learning

Analyzing Language Learned by an Active Question Answering Agent

78 - Christian Buck , Jannis Bulian , Massimiliano Ciaramita 2018

We analyze the language learned by an agent trained with reinforcement learning as a component of the ActiveQA system [Buck et al., 2017]. In ActiveQA, question answering is framed as a reinforcement learning task in which an agent sits between the user and a black box question-answering system. The agent learns to reformulate the users questions to elicit the optimal answers. It probes the system with ma

Computation and Language Artificial Intelligence

Temporal Meaning Representations in a Natural Language Front-End

123 - I. Androutsopoulos (Software & Knowledge Engineering Lab , Institute ofn Informatics & Telecommunications , NCSR Demokritos 1999

Previous work in the context of natural language querying of temporal databases has established a method to map automatically from a large subset of English time-related questions to suitable expressions of a temporal logic-like language, called TOP. An algorithm to translate from TOP to the TSQL2 temporal database language has also been defined. This paper shows how TOP expressions could be translated into a simpler logic-like language, called BOT. BOT is very close to traditional first-order predicate logic (FOPL), and hence existing methods to manipulate FOPL expressions can be exploited to interface to time-sensitive applications other than TSQL2 databases, maintaining the existing English-to-TOP mapping.

Computation and Language

Analyzing analytical methods: The case of phonology in neural models of spoken language

96 - Grzegorz Chrupa{l}a , Bertrand Higy , Afra Alishahi 2020

Given the fast development of analysis techniques for NLP and speech processing systems, few systematic studies have been conducted to compare the strengths and weaknesses of each method. As a step in this direction we study the case of representations of phonology in neural network models of spoken language. We use two commonly applied analytical techniques, diagnostic classifiers and representational similarity analysis, to quantify to what extent neural activation patterns encode phonemes and phoneme sequences. We manipulate two factors that can affect the outcome of analysis. First, we investigate the role of learning by comparing neural activations extracted from trained versus randomly-initialized models. Second, we examine the temporal scope of the activations by probing both local activations corresponding to a few milliseconds of the speech signal, and global activations pooled over the whole utterance. We conclude that reporting analysis results with randomly initialized models is crucial, and that global-scope methods tend to yield more consistent results and we recommend their use as a complement to local-scope diagnostic methods.

Computation and Language Machine Learning Sound

Learning Dynamic Author Representations with Temporal Language Models

98 - Edouard Delasalles , Sylvain Lamprier , Ludovic Denoyer 2019

Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variables that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors identities, as well as publication dates, are seldom considered. We propose a neural model, based on recurrent language modeling, which aims at capturing language diffusion tendencies in author communities through time. By conditioning language models with author and temporal vector states, we are able to leverage the latent dependencies between the text contexts. This allows us to beat several temporal and non-temporal language baselines on two real-world corpora, and to learn meaningful author representations that vary through time.

Computation and Language Machine Learning Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Analyzing machine-learned representations: A natural language case study

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions