Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Discrete Auto-regressive Variational Attention Models for Text Modeling

188 0 0.0 ( 0 )

Download Cite

Added by Fang Xianghong

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Xianghong Fang - Haoli Bai - Jian Li

Machine Learning Computation and Language

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Variational autoencoders (VAEs) have been widely applied for text modeling. In practice, however, they are troubled by two challenges: information underrepresentation and posterior collapse. The former arises as only the last hidden state of LSTM encoder is transformed into the latent space, which is generally insufficient to summarize the data. The latter is a long-standing problem during the training of VAEs as the optimization is trapped to a disastrous local optimum. In this paper, we propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges. Specifically, we introduce an auto-regressive variational attention approach to enrich the latent space by effectively capturing the semantic dependency from the input. We further design discrete latent space for the variational attention and mathematically show that our model is free from posterior collapse. Extensive experiments on language modeling tasks demonstrate the superiority of DAVAM against several VAE counterparts.

rate research

Direct Optimization through $arg max$ for Discrete Variational Auto-Encoder

586 - Guy Lorberbom 2018

Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates. In the discrete case, one can perform reparametrization using the Gumbel-Max trick, but the resulting objective relies on an $arg max$ operation and is non-differentiable. In contrast to previous works which resort to softmax-based relaxations, we propose to optimize it directly by applying the direct loss minimization approach. Our proposal extends naturally to structured discrete latent variable models when evaluating the $arg max$ operation is tractable. We demonstrate empirically the effectiveness of the direct loss minimization technique in variational autoencoders with both unstructured and structured discrete latent variables.

Machine Learning Machine Learning

Improved Variational Autoencoders for Text Modeling using Dilated Convolutions

161 - Zichao Yang , Zhiting Hu , Ruslan Salakhutdinov 2017

Recent work on generative modeling of text has found that variational auto-encoders (VAE) incorporating LSTM decoders perform worse than simpler LSTM language models (Bowman et al., 2015). This negative result is so far poorly understood, but has been attributed to the propensity of LSTM decoders to ignore conditioning information from the encoder. In this paper, we experiment with a new type of decoder for VAE: a dilated CNN. By changing the decoders dilation architecture, we control the effective context from previously generated words. In experiments, we find that there is a trade off between the contextual capacity of the decoder and the amount of encoding information used. We show that with the right decoder, VAE can outperform LSTM language models. We demonstrate perplexity gains on two datasets, representing the first positive experimental result on the use VAE for generative modeling of text. Further, we conduct an in-depth investigation of the use of VAE (with our new decoding architecture) for semi-supervised and unsupervised labeling tasks, demonstrating gains over several strong baselines.

Neural and Evolutionary Computing Computation and Language Machine Learning

Time-varying auto-regressive models for count time-series

84 - Arkaprava Roy , Sayar Karmakar 2020

Count-valued time series data are routinely collected in many application areas. We are particularly motivated to study the count time series of daily new cases, arising from COVID-19 spread. We propose two Bayesian models, a time-varying semiparametric AR(p) model for count and then a time-varying INGARCH model considering the rapid changes in the spread. We calculate posterior contraction rates of the proposed Bayesian methods with respect to average Hellinger metric. Our proposed structures of the models are amenable to Hamiltonian Monte Carlo (HMC) sampling for efficient computation. We substantiate our methods by simulations that show superiority compared to some of the close existing methods. Finally we analyze the daily time series data of newly confirmed cases to study its spread through different government interventions.

Methodology Statistics Theory Statistics Theory

Improve Diverse Text Generation by Self Labeling Conditional Variational Auto Encoder

301 - Yuchi Zhang , Yongliang Wang , Liping Zhang 2019

Diversity plays a vital role in many text generating applications. In recent years, Conditional Variational Auto Encoders (CVAE) have shown promising performances for this task. However, they often encounter the so called KL-Vanishing problem. Previous works mitigated such problem by heuristic methods such as strengthening the encoder or weakening the decoder while optimizing the CVAE objective function. Nevertheless, the optimizing direction of these methods are implicit and it is hard to find an appropriate degree to which these methods should be applied. In this paper, we propose an explicit optimizing objective to complement the CVAE to directly pull away from KL-vanishing. In fact, this objective term guides the encoder towards the best encoder of the decoder to enhance the expressiveness. A labeling network is introduced to estimate the best encoder. It provides a continuous label in the latent space of CVAE to help build a close connection between latent variables and targets. The whole proposed method is named Self Labeling CVAE~(SLCVAE). To accelerate the research of diverse text generation, we also propose a large native one-to-many dataset. Extensive experiments are conducted on two tasks, which show that our method largely improves the generating diversity while achieving comparable accuracy compared with state-of-art algorithms.

Machine Learning Computation and Language Machine Learning

Probing turbulence intermittency via Auto-Regressive Moving-Average models

552 - Davide Faranda , Flavio Maria Emanuele Pons , Berengere Dubrulle andn Francois Daviaud 2014

We suggest a new approach to probing intermittency corrections to the Kolmogorov law in turbulent flows based on the Auto-Regressive Moving-Average modeling of turbulent time series. We introduce a new index $Upsilon$ that measures the distance from a Kolmogorov-Obukhov model in the Auto-Regressive Moving-Average models space. Applying our analysis to Particle Image Velocimetry and Laser Doppler Velocimetry measurements in a von Karman swirling flow, we show that $Upsilon$ is proportional to the traditional intermittency correction computed from the structure function. Therefore it provides the same information, using much shorter time series. We conclude that $Upsilon$ is a suitable index to reconstruct the spatial intermittency of the dissipation in both numerical and experimental turbulent fields.

Fluid Dynamics

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Discrete Auto-regressive Variational Attention Models for Text Modeling

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions