Due to the discrete nature of words, language GANs must be optimized from rewards provided by discriminator networks via reinforcement learning methods. This is a much harder setting than for continuous tasks, which enjoy direct gradient flow from discriminator to generator, and it usually leads to dramatic learning instabilities. However, we claim that this can be solved by making the discriminator and generator networks cooperate to produce output sequences during training. These cooperative outputs, inherently built to obtain higher discrimination scores, not only provide denser rewards for training but also form a more compact artificial set for discriminator training, hence improving its accuracy and stability. In this paper, we show that our SelfGAN framework, built on this cooperative principle, outperforms Teacher Forcing and obtains state-of-the-art results on two challenging tasks, Summarization and Question Generation.
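As a concrete illustration of the cooperative principle, the following minimal Python sketch reranks generator beam candidates by discriminator score; `generator` and `discriminator` are hypothetical wrappers around a seq2seq language model and a sequence classifier, not an interface taken from the paper.

```python
import torch

def cooperative_decode(generator, discriminator, prompt_ids,
                       beam_size=5, max_len=50):
    """Discriminator-guided decoding sketch: the generator proposes
    candidate continuations and the discriminator's score reranks
    them, so the emitted sequence is built to score well under the
    discriminator."""
    # Generator proposes several candidate sequences via beam search.
    candidates = generator.generate(
        prompt_ids, num_beams=beam_size,
        num_return_sequences=beam_size, max_length=max_len,
    )
    # Discriminator scores each candidate's "realness".
    with torch.no_grad():
        scores = discriminator.score(candidates)  # shape: (beam_size,)
    # Keep the candidate the discriminator likes best: this cooperative
    # output serves both as a denser reward signal and as a hard
    # artificial example for the next discriminator update.
    return candidates[scores.argmax()]
```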
Wetting transitions have been predicted and observed to occur for various combinations of fluids and surfaces. This paper describes the origin of such transitions, for liquid films on solid surfaces, in terms of the gas-surface interaction potentials V(r), which depend on the specific adsorption system. The transitions of light inert gases and H2 molecules on alkali metal surfaces have been explored extensively and are relatively well understood in terms of the least attractive adsorption interactions in nature. Much less thoroughly investigated are wetting transitions of Hg, water, heavy inert gases and other molecular films. The basic idea is that nonwetting occurs, for energetic reasons, if the adsorption potential's well-depth D is smaller than, or comparable to, the well-depth of the adsorbate-adsorbate mutual interaction. At the wetting temperature, Tw, the transition to wetting occurs, for entropic reasons, when the liquid's surface tension is sufficiently small that the free energy cost in forming a thick film is sufficiently compensated by the fluid-surface interaction energy. Guidelines useful for exploring wetting transitions of other systems are analyzed, in terms of generic criteria involving the simple model, which yields results in terms of gas-surface interaction parameters and thermodynamic properties of the bulk adsorbate.
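To make the energetic criterion concrete, a schematic form of the simple-model balance (our notation; the paper's exact expressions may differ) is
$$\rho_\ell(T_w)\,\Big|\int_{z_{\min}}^{\infty} V(z)\,dz\Big| \;\approx\; 2\,\gamma_{\ell v}(T_w),$$
where $\rho_\ell$ is the liquid number density, $\gamma_{\ell v}$ the liquid-vapor surface tension, and $z_{\min}$ the location of the potential minimum. Since $\gamma_{\ell v}$ decreases with temperature while the left-hand side varies comparatively weakly, the balance is first satisfied at $T_w$, above which the film wets the surface.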
The entrainment transition of coupled random frequency oscillators presents a long-standing problem in nonlinear physics. The onset of entrainment in populations of large but finite size exhibits strong sensitivity to fluctuations in the oscillator density at the synchronizing frequency. This is the source of the unusual values assumed by the correlation size exponent $\nu$. Locally coupled oscillators on a $d$-dimensional lattice exhibit two types of frequency entrainment: symmetry-breaking at $d > 4$, and aggregation of compact synchronized domains in three and four dimensions. Various critical properties of the transition are well captured by finite-size scaling relations with simple yet unconventional exponent values.
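As an illustration of the kind of relation meant here (the standard finite-size-scaling form, not quoted from the paper), the order parameter $r$ near the critical coupling $K_c$ is assumed to obey
$$r(N,K) = N^{-\beta/\nu}\, f\!\left((K-K_c)\,N^{1/\nu}\right),$$
where $N$ is the number of oscillators, $f$ is a scaling function, and $\nu$ is the correlation size exponent whose unusual value reflects the density fluctuations at the synchronizing frequency.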
Different flavors of transfer learning have shown tremendous impact in advancing research and applications of machine learning. In this work we study the use of a specific family of transfer learning, where the target domain is mapped to the source domain. Specifically, we map Natural Language Understanding (NLU) problems to Question Answering (QA) problems, and we show that in low data regimes this approach offers significant improvements compared to other approaches to NLU. Moreover, we show that these gains can be increased through sequential transfer learning across NLU problems from different domains. We show that our approach can reduce the amount of data required for the same performance by up to a factor of 10.
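The following hypothetical Python sketch illustrates the general recipe of casting an NLU problem (here, intent classification) as QA; it is an illustration of the mapping idea, not the paper's exact formulation.

```python
def nlu_to_qa(utterance, candidate_intents):
    """Cast intent classification as QA: the utterance becomes the
    context, a fixed question asks for the user's goal, and the intent
    labels become answer options, so a pretrained QA model can be
    reused with little in-domain data."""
    question = "What does the user want to do?"
    options = [intent.replace("_", " ") for intent in candidate_intents]
    return {"question": question, "context": utterance, "options": options}

example = nlu_to_qa(
    "Wake me up at 7 tomorrow",
    ["set_alarm", "play_music", "check_weather"],
)
# A QA model scores each option against (question, context); the
# highest-scoring option is returned as the predicted intent.
```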
Existing question answering (QA) datasets are created mainly to enable AI systems to answer questions asked by humans. In educational applications, however, teachers and parents may not know which questions they should ask a child to maximize language-learning outcomes. Using a newly released book QA dataset (FairytaleQA), in which educational experts labeled 46 fairytale storybooks for early-childhood readers, we developed an automated QA generation model architecture for this novel application. Our model (1) extracts candidate answers from a given storybook passage through carefully designed heuristics based on a pedagogical framework; (2) generates appropriate questions corresponding to each extracted answer using a language model; and (3) ranks the top QA pairs with another QA model. Automatic and human evaluations show that our model outperforms baselines. We also demonstrate that our method can help with the scarcity issue of children's book QA data via data augmentation on 200 unlabeled storybooks.
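A minimal sketch of this three-stage pipeline in Python, where `answer_extractor`, `question_generator`, and `ranker` are hypothetical stand-ins for the heuristic extractor, the question-generation language model, and the ranking QA model:

```python
def generate_qa_pairs(passage, answer_extractor, question_generator,
                      ranker, top_k=5):
    """Three-stage QA generation sketch for a storybook passage."""
    qa_pairs = []
    # (1) Heuristically extract candidate answer spans from the passage.
    for answer in answer_extractor(passage):
        # (2) Generate a question conditioned on the passage and answer.
        question = question_generator(passage, answer)
        qa_pairs.append((question, answer))
    # (3) Rank pairs, e.g. by how well a QA model recovers each answer
    # from its generated question, and keep the top_k.
    return sorted(qa_pairs,
                  key=lambda qa: ranker(passage, *qa),
                  reverse=True)[:top_k]
```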
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made by the common training objective, maximum likelihood estimation, for the CQR task. On CQR benchmarks for task-oriented dialogue systems, we evaluate fine-tuned PLMs on the recently introduced CANARD dataset as an in-domain task and validate the models on data from the TREC 2019 CAsT Track as an out-of-domain task. Examining a variety of architectures with different numbers of parameters, we demonstrate that the recent text-to-text transfer transformer (T5) achieves the best results on both CANARD and CAsT with fewer parameters than similar transformer architectures.
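A minimal sketch of T5-based question rewriting with the Hugging Face `transformers` library; the input serialization (history and question joined by a "|||" separator under a "rewrite:" prefix) is our assumption, not necessarily the paper's exact format.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# A conversational question whose pronoun must be resolved from history.
history = "Who wrote Frankenstein? ||| Mary Shelley."
question = "When did she write it?"
inputs = tokenizer("rewrite: " + history + " ||| " + question,
                   return_tensors="pt")

# After fine-tuning on (history, question) -> self-contained question
# pairs such as CANARD, generate() would produce something like
# "When did Mary Shelley write Frankenstein?".
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```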