
Decoding the Text Encoding

Added by Hamid Izadinia
Publication date: 2014
Research language: English


Word clouds are among the most popular and widely used types of text visualization. Despite the attractiveness and simplicity of producing word clouds, they do not provide a thorough visualization of the distribution of the underlying data. It is therefore important to redesign word clouds to improve their design choices and to enable further statistical analysis of the data. In this paper we propose a fully automatic redesigning algorithm for word cloud visualization. Our method decodes an input word cloud visualization and recovers the raw data in the form of a list of (word, value) pairs. To the best of our knowledge, this is the first attempt to extract raw data from a word cloud visualization. We have evaluated the proposed method both qualitatively and quantitatively; the results show that our algorithm extracts the words and their weights effectively, with a considerably low error rate.
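The decoder's output is a list of (word, value) pairs recovered from the rendered image. Below is a minimal, hypothetical sketch of that final stage, assuming an OCR step has already produced words with pixel bounding boxes and that rendered text height is roughly proportional to the underlying weight; neither assumption comes from the paper.

```python
# Minimal sketch: turn OCR'd word boxes into normalized (word, weight) pairs.
# Assumption: rendered font size (box height) is roughly proportional to the
# underlying word weight, as in most word cloud layouts.

def decode_word_cloud(ocr_boxes):
    """ocr_boxes: list of (word, x, y, width, height) tuples from an OCR pass."""
    if not ocr_boxes:
        return []
    max_h = max(h for _, _, _, _, h in ocr_boxes)
    # Scale weights so the largest word gets weight 1.0.
    return sorted(
        ((word, h / max_h) for word, _, _, _, h in ocr_boxes),
        key=lambda pair: pair[1],
        reverse=True,
    )

# Example usage with hypothetical OCR output.
boxes = [("data", 10, 12, 120, 48), ("cloud", 140, 30, 80, 24), ("word", 5, 70, 150, 60)]
print(decode_word_cloud(boxes))  # [('word', 1.0), ('data', 0.8), ('cloud', 0.4)]
```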

Related research

Text encoding is one of the most important steps in Natural Language Processing (NLP). It is handled well by the self-attention mechanism in the current state-of-the-art Transformer encoder, which has brought about significant improvements in the performance of many NLP tasks. Though the Transformer encoder may effectively capture general information in its resulting representations, the backbone information, meaning the gist of the input text, is not specifically focused on. In this paper, we propose explicit and implicit text compression approaches to enhance the Transformer encoding and evaluate models using this approach on several typical downstream tasks that rely on the encoding heavily. Our explicit text compression approaches use dedicated models to compress text, while our implicit text compression approach simply adds an additional module to the main model to handle text compression. We propose three ways of integration, namely backbone source-side fusion, target-side fusion, and both-side fusion, to integrate the backbone information into Transformer-based models for various downstream tasks. Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines. We therefore conclude that, when comparing the encodings to the baseline models, text compression helps the encoders learn better language representations.
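As a rough illustration of what source-side fusion could look like, the sketch below lets the encoder states attend to a compressed "backbone" sequence and mixes the two streams with a learned gate. The module, its names, and the gating choice are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class BackboneSourceFusion(nn.Module):
    """Hypothetical source-side fusion: encoder states attend to a compressed
    backbone representation, and a sigmoid gate mixes the two streams."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, enc_states, backbone_states):
        # enc_states: (batch, src_len, d_model); backbone_states: (batch, cmp_len, d_model)
        attended, _ = self.cross_attn(enc_states, backbone_states, backbone_states)
        g = torch.sigmoid(self.gate(torch.cat([enc_states, attended], dim=-1)))
        return self.norm(g * attended + (1.0 - g) * enc_states)

# Example usage with random tensors.
fusion = BackboneSourceFusion(d_model=512)
enc = torch.randn(2, 20, 512)       # regular Transformer encoder output
backbone = torch.randn(2, 6, 512)   # encoded compressed text ("backbone")
print(fusion(enc, backbone).shape)  # torch.Size([2, 20, 512])
```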
Discourse representation tree structure (DRTS) parsing is a novel semantic parsing task that has recently attracted attention. State-of-the-art performance can be achieved by a neural sequence-to-sequence model that treats tree construction as an incremental sequence generation problem. Structural information such as the input syntax and the intermediate skeleton of the partial output has been ignored in this model, although it could be potentially useful for DRTS parsing. In this work, we propose a structure-aware model at both the encoder and decoder phases to integrate the structural information, where a graph attention network (GAT) is exploited for effective modeling. Experimental results on a benchmark dataset show that our proposed model is effective and obtains the best performance in the literature.
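For readers unfamiliar with graph attention, below is a minimal single-head GAT layer over an adjacency matrix, following the standard GAT formulation; it is a generic sketch, not the paper's specific encoder/decoder integration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention layer in the standard GAT style."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):
        # x: (n_nodes, in_dim); adj: (n_nodes, n_nodes) with 1 where an edge exists.
        h = self.proj(x)                                    # (n, out_dim)
        n = h.size(0)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)], dim=-1
        )                                                   # (n, n, 2*out_dim)
        scores = F.leaky_relu(self.attn(pairs).squeeze(-1), negative_slope=0.2)
        scores = scores.masked_fill(adj == 0, float("-inf"))  # attend only along edges
        alpha = torch.softmax(scores, dim=-1)
        return alpha @ h                                    # aggregated node features

# Example: 4 syntax nodes with self-loops plus a small dependency chain.
adj = torch.eye(4) + torch.diag(torch.ones(3), 1) + torch.diag(torch.ones(3), -1)
layer = GATLayer(in_dim=16, out_dim=32)
print(layer(torch.randn(4, 16), adj).shape)  # torch.Size([4, 32])
```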
Shinsuke Koyama (2012)
Neural coding is a field of study that concerns how sensory information is represented in the brain by networks of neurons. The link between external stimulus and neural response can be studied from two parallel points of view. The first, neural encoding, refers to the mapping from stimulus to response, and primarily focuses on understanding how neurons respond to a wide variety of stimuli and on constructing models that accurately describe the stimulus-response relationship. Neural decoding, on the other hand, refers to the reverse mapping, from response to stimulus, where the challenge is to reconstruct a stimulus from the spikes it evokes. Since neuronal response is stochastic, a one-to-one mapping of stimuli into neural responses does not exist, causing a mismatch between the two viewpoints of neural coding. Here, we use these two perspectives to investigate the question of what rate coding is, in the simple setting of a single stationary stimulus parameter and a single stationary spike train represented by a renewal process. We show that when rate codes are defined in terms of encoding, i.e., the stimulus parameter is mapped onto the mean firing rate, the rate decoder given by spike counts or the sample mean does not always decode the rate codes efficiently, but its efficiency in reading certain rate codes can be improved when correlations within a spike train are taken into account.
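A small simulation makes the encoding/decoding distinction concrete: a stimulus parameter is encoded as the mean rate of a gamma renewal process, and the rate is then decoded from the spike count over the observation window. The tuning curve, gamma shape, and window length below are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_renewal_spike_train(rate_hz, duration_s, shape=4.0):
    """Spike times from a gamma renewal process with the given mean rate.
    Mean inter-spike interval is 1/rate, so the gamma scale is 1 / (shape * rate)."""
    isis = rng.gamma(shape, 1.0 / (shape * rate_hz), size=int(5 * rate_hz * duration_s) + 10)
    times = np.cumsum(isis)
    return times[times < duration_s]

# Encoding: the stimulus parameter s is mapped onto a mean firing rate.
def encode(s):
    return 10.0 + 5.0 * s  # Hz; an arbitrary monotone tuning curve

# Decoding: estimate the rate from the spike count in the observation window.
duration = 2.0  # seconds
s_true = 3.0
spikes = gamma_renewal_spike_train(encode(s_true), duration)
rate_hat = len(spikes) / duration
s_hat = (rate_hat - 10.0) / 5.0  # invert the tuning curve
print(f"true rate {encode(s_true):.1f} Hz, decoded rate {rate_hat:.1f} Hz, "
      f"decoded stimulus {s_hat:.2f}")
```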
We propose a novel framework for modeling the interaction between graphical structures and the natural language text associated with their nodes and edges. Existing approaches typically fall into two categories. One group ignores the relational structure by converting the graphs into linear sequences and then utilizes the highly successful Seq2Seq models. The other ignores the sequential nature of the text by representing the segments as fixed-dimensional vectors and applying graph neural networks. Both simplifications lead to information loss. Our proposed method utilizes both the graphical structure and the sequential nature of the text. The input to our model is a set of text segments associated with the nodes and edges of the graph, which are then processed with a transformer encoder-decoder model equipped with a self-attention mechanism that is aware of the graphical relations between the nodes containing the segments. This also allows us to use BERT-like models that are already trained on large amounts of text. While the proposed model has wide applications, we demonstrate its capabilities on data-to-text generation tasks. Our approach compares favorably against state-of-the-art methods in four tasks without tailoring the model architecture. We also provide an early demonstration in a novel practical application -- generating clinical notes from the medical entities mentioned during clinical visits.
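One common way to make self-attention "aware of the graphical relations" is to add a learned bias to the attention logits according to the relation between the graph elements that own each pair of segments. The sketch below illustrates that general idea; the class, its names, and the scalar-bias choice are assumptions, not the paper's exact mechanism.

```python
import torch
import torch.nn as nn

class RelationAwareAttention(nn.Module):
    """Scaled dot-product attention with a learned additive bias per relation type
    between the graph elements that own each text segment (hypothetical sketch)."""

    def __init__(self, d_model: int, n_relations: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.rel_bias = nn.Embedding(n_relations, 1)  # one scalar bias per relation type

    def forward(self, x, rel_ids):
        # x: (seq, d_model); rel_ids: (seq, seq) integer relation type between tokens' owners
        q, k, v = self.q(x), self.k(x), self.v(x)
        logits = q @ k.T / (x.size(-1) ** 0.5)
        logits = logits + self.rel_bias(rel_ids).squeeze(-1)
        return torch.softmax(logits, dim=-1) @ v

# Example: 5 tokens, 3 relation types (e.g. same node, connected nodes, unrelated).
attn = RelationAwareAttention(d_model=64, n_relations=3)
x = torch.randn(5, 64)
rel_ids = torch.randint(0, 3, (5, 5))
print(attn(x, rel_ids).shape)  # torch.Size([5, 64])
```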
Despite the common usage of a canonical, data-independent, hemodynamic response function (HRF), it is known that the shape of the HRF varies across brain regions and subjects. This suggests that a data-driven estimation of this function could lead to more statistical power when modeling BOLD fMRI data. However, unconstrained estimation of the HRF can yield highly unstable results when the number of free parameters is large. We develop a method for the joint estimation of activation and HRF using a rank constraint that forces the estimated HRF to be equal across events/conditions, yet permits it to differ across voxels. Model estimation leads to an optimization problem that we propose to solve with an efficient quasi-Newton method exploiting fast gradient computations. This model, called GLM with Rank-1 constraint (R1-GLM), can be extended to the setting of GLM with separate designs, which has been shown to improve accuracy in brain activity decoding experiments. We compare 10 different HRF modeling methods in terms of encoding and decoding score on two different datasets. Our results show that the R1-GLM model significantly outperforms competing methods in both encoding and decoding settings, positioning it as an attractive method from the point of view of both accuracy and computational efficiency.
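The rank-1 idea can be written as constraining the coefficient matrix to an outer product, B = h βᵀ, with one HRF h shared across events within a voxel and one amplitude per event. Below is a minimal least-squares sketch of that parametrization fitted with L-BFGS on synthetic data; the data, variable names, and solver choice are illustrative, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic data: n scans, m events/conditions, FIR basis of length k per event.
n, m, k = 200, 3, 12
X = rng.standard_normal((n, m * k))      # columns ordered event-major: [event0 lags, event1 lags, ...]
h_true = np.hanning(k)                   # a smooth "true" HRF shape
beta_true = np.array([1.5, -0.5, 2.0])   # per-event amplitudes
B_true = np.outer(h_true, beta_true)     # rank-1 coefficient matrix, shape (k, m)
y = X @ B_true.T.ravel() + 0.1 * rng.standard_normal(n)

def loss_and_grad(params):
    """Squared-error loss of the rank-1 GLM y ~ X vec(B), B = outer(h, beta), plus its gradient."""
    h, beta = params[:k], params[k:]
    resid = X @ np.outer(h, beta).T.ravel() - y
    grad_B = (X.T @ resid).reshape(m, k).T   # dLoss/dB, shape (k, m)
    return 0.5 * resid @ resid, np.concatenate([grad_B @ beta, grad_B.T @ h])

x0 = np.concatenate([np.ones(k) / k, np.ones(m)])  # crude initialization
res = minimize(loss_and_grad, x0, jac=True, method="L-BFGS-B")
h_hat, beta_hat = res.x[:k], res.x[k:]

# h and beta are only identified up to a shared scale, but their outer product is.
rel_err = np.linalg.norm(np.outer(h_hat, beta_hat) - B_true) / np.linalg.norm(B_true)
print(f"relative error of the recovered rank-1 coefficients: {rel_err:.3f}")
```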