Do you want to publish a course? Click here

FlowPrior: Learning Expressive Priors for Latent Variable Sentence Models

flowprior: تعلم البشاشة التعبيرية لنماذج الجملة المتغيرة الكامنة

253   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

Variational autoencoders (VAEs) are widely used for latent variable modeling of text. We focus on variations that learn expressive prior distributions over the latent variable. We find that existing training strategies are not effective for learning rich priors, so we propose adding the importance-sampled log marginal likelihood as a second term to the standard VAE objective to help when learning the prior. Doing so improves results for all priors evaluated, including a novel choice for sentence VAEs based on normalizing flows (NF). Priors parameterized with NF are no longer constrained to a specific distribution family, allowing a more flexible way to encode the data distribution. Our model, which we call FlowPrior, shows a substantial improvement in language modeling tasks compared to strong baselines. We demonstrate that FlowPrior learns an expressive prior with analysis and several forms of evaluation involving generation.



References used
https://aclanthology.org/
rate research

Read More

Undirected neural sequence models have achieved performance competitive with the state-of-the-art directed sequence models that generate monotonically from left to right in machine translation tasks. In this work, we train a policy that learns the ge neration order for a pre-trained, undirected translation model via reinforcement learning. We show that the translations decoded by our learned orders achieve higher BLEU scores than the outputs decoded from left to right or decoded by the learned order from Mansimov et al. (2019) on the WMT'14 German-English translation task. On examples with a maximum source and target length of 30 from De-En and WMT'16 English-Romanian tasks, our learned order outperforms all heuristic generation orders on three out of four language pairs. We next carefully analyze the learned order patterns via qualitative and quantitative analysis. We show that our policy generally follows an outer-to-inner order, predicting the left-most and right-most positions first, and then moving toward the middle while skipping less important words at the beginning. Furthermore, the policy usually predicts positions for a single syntactic constituent structure in consecutive steps. We believe our findings could provide more insights on the mechanism of undirected generation models and encourage further research in this direction.
Compositional, structured models are appealing because they explicitly decompose problems and provide interpretable intermediate outputs that give confidence that the model is not simply latching onto data artifacts. Learning these models is challeng ing, however, because end-task supervision only provides a weak indirect signal on what values the latent decisions should take. This often results in the model failing to learn to perform the intermediate tasks correctly. In this work, we introduce a way to leverage paired examples that provide stronger cues for learning latent decisions. When two related training examples share internal substructure, we add an additional training objective to encourage consistency between their latent decisions. Such an objective does not require external supervision for the values of the latent output, or even the end task, yet provides an additional training signal to that provided by individual training examples themselves. We apply our method to improve compositional question answering using neural module networks on the DROP dataset. We explore three ways to acquire paired questions in DROP: (a) discovering naturally occurring paired examples within the dataset, (b) constructing paired examples using templates, and (c) generating paired examples using a question generation model. We empirically demonstrate that our proposed approach improves both in- and out-of-distribution generalization and leads to correct latent decision predictions.
We propose a multi-task, probabilistic approach to facilitate distantly supervised relation extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. To achieve this, we bias the latent space of senten ces via a Variational Autoencoder (VAE) that is trained jointly with a relation classifier. The latent code guides the pair representations and influences sentence reconstruction. Experimental results on two datasets created via distant supervision indicate that multi-task learning results in performance benefits. Additional exploration of employing Knowledge Base priors into theVAE reveals that the sentence space can be shifted towards that of the Knowledge Base, offering interpretability and further improving results.
The researcher in the history of ancient Eastern civilizations observes an essential aspect in them, which is the important role played by the belief in those civilizations, as it motivation and directed the foundation to her, and what submitted of m ankind's varied civilized achievements of (political, architectural, artistic, and others)proves that. As the impact of religious belief was apparent regarding the human life, in terms of worship of many gods idealized according to their functions believed in that era, so since human minds were occupied by the prevailing myths and fantasy which also control their thoughts and activities, and that is what the different archaeological findings reflected to us such as the clay dolls which through them, they expressed the female god or the mother god in primitive spontaneous way, so it was the language spoken and expressed their vision and imagination spontaneously away from theories or artistic trends. But when we try to read and analyze them we are surprised by what they included of aesthetic and formative values appear in the beauty of artistic expression, and the way they formed the shape and linked it to the content, As if they were aware of what he was doing, which added high artistic values to its historical value it may be relatively comparable to what was carried out by contemporary academic sculptor.
Explaining neural network models is important for increasing their trustworthiness in real-world applications. Most existing methods generate post-hoc explanations for neural network models by identifying individual feature attributions or detecting interactions between adjacent features. However, for models with text pairs as inputs (e.g., paraphrase identification), existing methods are not sufficient to capture feature interactions between two texts and their simple extension of computing all word-pair interactions between two texts is computationally inefficient. In this work, we propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together and measure their contribution to the corresponding NLP tasks as a whole. The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets, including natural language inference and paraphrase identification tasks. Experiments show the effectiveness of GMASK in providing faithful explanations to these models.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا