CoT: Cooperative Training for Generative Modeling of Discrete Data

Added by Sidi Lu

Publication date: 2018

Fields: Informatics Engineering

Language: English

Authors: Sidi Lu, Lantao Yu, Siyuan Feng

Machine Learning Artificial Intelligence Computation and Language

In this paper, we study the generative models of sequential discrete data. To tackle the exposure bias problem inherent in maximum likelihood estimation (MLE), generative adversarial networks (GANs) are introduced to penalize the unrealistic generated samples. To exploit the supervision signal from the discriminator, most previous models leverage REINFORCE to address the non-differentiable problem of sequential discrete data. However, because of the unstable property of the training signal during the dynamic process of adversarial training, the effectiveness of REINFORCE, in this case, is hardly guaranteed. To deal with such a problem, we propose a novel approach called Cooperative Training (CoT) to improve the training of sequence generative models. CoT transforms the min-max game of GANs into a joint maximization framework and manages to explicitly estimate and optimize Jensen-Shannon divergence. Moreover, CoT works without the necessity of pre-training via MLE, which is crucial to the success of previous methods. In the experiments, compared to existing state-of-the-art methods, CoT shows superior or at least competitive performance on sample quality, diversity, as well as training stability.
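The Jensen-Shannon divergence named in the abstract can be illustrated for two small discrete distributions. The sketch below uses only the standard textbook definition, JSD(P, Q) = ½ KL(P ‖ M) + ½ KL(Q ‖ M) with M = ½ (P + Q). It is not the paper's training procedure, and the function names `kl_divergence` and `js_divergence` are hypothetical, chosen for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between discrete distributions p and q.

    Illustrative only: this is the textbook definition, not code from the paper.
    """
    # Mixture distribution M = (P + Q) / 2; m_i > 0 whenever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    data = [0.5, 0.3, 0.2]    # assumed "real" token distribution
    model = [0.4, 0.4, 0.2]   # assumed generator distribution
    print(js_divergence(data, model))
```

The divergence is symmetric, is zero when the two distributions coincide, and is bounded above by log 2 (in nats), which it reaches when the distributions have disjoint support.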

Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

64 - Puyudi Yang , Jianbo Chen , Cho-Jui Hsieh 2018

We present a probabilistic framework for studying adversarial attacks on discrete data. Based on this framework, we derive a perturbation-based method, Greedy Attack, and a scalable learning-based method, Gumbel Attack, that illustrate various tradeoffs in the design of attacks. We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN and an LSTM. As an example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection by modifying only five characters through Greedy Attack.

Machine Learning Artificial Intelligence Computation and Language

Learning Emergent Discrete Message Communication for Cooperative Reinforcement Learning

139 - Sheng Li , Yutai Zhou , Ross Allen 2021

Communication is an important factor that enables agents to work cooperatively in multi-agent reinforcement learning (MARL). Most previous work uses continuous message communication, whose high representational capacity comes at the expense of interpretability. Allowing agents to learn their own discrete message communication protocol that emerges from a variety of domains can increase the interpretability for human designers and other agents. This paper proposes a method to generate discrete messages analogous to human languages and to achieve communication through a broadcast-and-listen mechanism based on self-attention. We show that discrete message communication has performance comparable to continuous message communication but with a much smaller vocabulary size. Furthermore, we propose an approach that allows humans to interactively send discrete messages to agents.

Machine Learning Artificial Intelligence Multiagent Systems

Analyzing and Improving Generative Adversarial Training for Generative Modeling and Out-of-Distribution Detection

339 - Xuwang Yin , Shiying Li , Gustavo K. Rohde 2020

Generative adversarial training (GAT) is a recently introduced adversarial defense method. Previous works have focused on empirical evaluations of its application to training robust predictive models. In this paper we focus on a theoretical understanding of the GAT method and on extending its application to generative modeling and out-of-distribution detection. We analyze the optimal solutions of the maximin formulation employed by the GAT objective, and make a comparative analysis of the minimax formulation employed by GANs. We use theoretical analysis and 2D simulations to understand the convergence property of the training algorithm. Based on these results, we develop an incremental generative training algorithm, and conduct comprehensive evaluations of the algorithm's application to image generation and adversarial out-of-distribution detection. Our results suggest that generative adversarial training is a promising new direction for the above applications.

Machine Learning Computer Vision and Pattern Recognition Computer Science and Game Theory

Trellis Networks for Sequence Modeling

118 - Shaojie Bai , J. Zico Kolter , Vladlen Koltun 2018

We present trellis networks, a new architecture for sequence modeling. On the one hand, a trellis network is a temporal convolutional network with special structure, characterized by weight tying across depth and direct injection of the input into deep layers. On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices. Thus trellis networks with general weight matrices generalize truncated recurrent networks. We leverage these connections to design high-performing trellis networks that absorb structural and algorithmic elements from both recurrent and convolutional models. Experiments demonstrate that trellis networks outperform the current state-of-the-art methods on a variety of challenging benchmarks, including word-level language modeling and character-level language modeling tasks, and stress tests designed to evaluate long-term memory retention. The code is available at https://github.com/locuslab/trellisnet.

Machine Learning Artificial Intelligence Computation and Language

Tomography and Generative Data Modeling via Quantum Boltzmann Training

89 - Maria Kieferova , Nathan Wiebe 2016

The promise of quantum neural nets, which utilize quantum effects to model complex data sets, has made their development an aspirational goal for quantum machine learning and quantum computing in general. Here we provide new methods of training quantum Boltzmann machines, which are a class of recurrent quantum neural networks. Our work generalizes existing methods and provides new approaches for training quantum neural networks that compare favorably to existing methods. We further demonstrate that quantum Boltzmann machines enable a form of quantum state tomography that not only estimates a state but also provides a prescription for generating copies of the reconstructed state. Classical Boltzmann machines are incapable of this. Finally, we compare small non-stoquastic quantum Boltzmann machines to traditional Boltzmann machines for generative tasks and observe evidence that quantum models outperform their classical counterparts.

Quantum Physics
