Submodularity is desirable for a variety of objectives in content selection where the current neural encoder-decoder framework is inadequate. However, it has so far not been explored in neural encoder-decoder systems for text generation. In this work, we define diminishing attentions with submodular functions and, in turn, prove the submodularity of the effective neural coverage. The greedy algorithm that approximates the solution to the submodular maximization problem is not suited to optimizing attention scores in auto-regressive generation. Therefore, instead of following the way submodular functions have typically been applied, we propose a simplified yet principled solution. The resulting attention module offers an architecturally simple and empirically effective method for improving the coverage of neural text generation. We run experiments on three directed text generation tasks with different levels of recovery rate, across two modalities, three neural model architectures, and two training-strategy variations. The results and analyses demonstrate that our method generalizes well across these settings, produces texts of good quality, and outperforms state-of-the-art baselines.
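For reference, a set function f : 2^V → R is submodular when it exhibits diminishing returns:

\[
f(A \cup \{e\}) - f(A) \;\ge\; f(B \cup \{e\}) - f(B)
\qquad \text{for all } A \subseteq B \subseteq V,\ e \in V \setminus B.
\]

One plausible reading of the coverage claim above, shown here as a sketch rather than the paper's exact construction, composes a concave function with accumulated attention: with \( a_{t,i} \ge 0 \) the attention that decoding step \( t \) places on source token \( i \), and \( \phi \) concave and non-decreasing (e.g., \( \phi(x) = \min(x, 1) \)),

\[
f(A) \;=\; \sum_{i} \phi\!\Big(\sum_{t \in A} a_{t,i}\Big)
\]

is submodular in the set of decoding steps \( A \), since a concave function of a nonnegative modular function is submodular and submodularity is closed under nonnegative sums.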
We present NUBIA, a methodology to build automatic evaluation metrics for text generation using only machine learning models as core components. A typical NUBIA model is composed of three modules: a neural feature extractor, an aggregator, and a calibrator.
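A minimal sketch of that three-module composition follows; the class name, the toy feature, and the simple aggregator and calibrator below are illustrative assumptions (actual NUBIA modules are pretrained neural models), not the authors' configuration:

```python
# Sketch of a feature-extractor -> aggregator -> calibrator scoring pipeline.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class NubiaStyleMetric:
    extractors: List[Callable[[str, str], float]]  # neural feature extractors
    aggregator: Callable[[List[float]], float]     # features -> raw score
    calibrator: Callable[[float], float]           # raw score -> bounded scale

    def score(self, reference: str, candidate: str) -> float:
        features = [f(reference, candidate) for f in self.extractors]
        return self.calibrator(self.aggregator(features))

# Usage with stand-in components (a word-overlap feature instead of a
# neural one, mean aggregation, and clamping to [0, 1]):
metric = NubiaStyleMetric(
    extractors=[lambda r, c: float(len(set(r.split()) & set(c.split())))],
    aggregator=lambda feats: sum(feats) / len(feats),
    calibrator=lambda s: max(0.0, min(1.0, s / 10.0)),
)
print(metric.score("the cat sat", "a cat sat down"))
```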
Large-scale pretrained language models have led to dramatic improvements in text generation. Impressive performance can be achieved by finetuning only on a small number of instances (few-shot setting). Nonetheless, almost all previous work simply applies random sampling to select the few-shot training instances.
Neural models for text generation require a softmax layer with proper token embeddings during the decoding phase. Most existing approaches adopt a single point embedding for each token. However, a word may have multiple senses according to different contexts.
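As a generic illustration of moving beyond single-point embeddings (a sketch of one multi-sense softmax variant, not the specific model proposed above), each token can carry K sense vectors whose per-sense scores are aggregated with a log-sum-exp:

```python
# Illustrative multi-sense softmax: each vocabulary token gets K sense
# embeddings; a token's logit aggregates its per-sense inner products.
import torch

vocab_size, hidden, K = 1000, 64, 3
sense_emb = torch.randn(vocab_size, K, hidden)  # K sense vectors per token

def multi_sense_logits(h: torch.Tensor) -> torch.Tensor:
    """h: (batch, hidden) decoder state -> (batch, vocab) logits."""
    # (batch, vocab, K): score every sense of every token against h
    scores = torch.einsum("bh,vkh->bvk", h, sense_emb)
    # Aggregate over senses with log-sum-exp, so the dominant sense
    # wins smoothly while others still contribute
    return torch.logsumexp(scores, dim=-1)

h = torch.randn(8, hidden)
probs = torch.softmax(multi_sense_logits(h), dim=-1)  # (8, vocab_size)
```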
Recent advances in maximizing mutual information (MI) between the source and target have demonstrated its effectiveness in text generation. However, previous works paid little attention to modeling the backward network of MI (i.e., the dependency from the target to the source), which is crucial to the tightness of the variational information maximization lower bound.
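For context, the backward network typically enters through the standard variational lower bound on MI (Barber and Agakov, 2003); this is the general identity behind such formulations, not necessarily the exact bound used in this paper:

\[
I(X;Y) \;\ge\; H(X) + \mathbb{E}_{p(x,y)}\big[\log q_\phi(x \mid y)\big],
\]

where \( q_\phi(x \mid y) \) is the backward (target-to-source) network. The bound is tight exactly when \( q_\phi \) matches the true posterior \( p(x \mid y) \), which is why the quality of the backward model governs how tight the MI objective is.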
Prototype-driven text generation uses non-parametric models that first choose from a library of sentence prototypes and then modify the prototype to generate the output text. While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus.
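A minimal sketch of the retrieve-then-edit pattern, with hypothetical names rather than any paper's API, makes the test-time cost concrete: every query scans an index over the entire training corpus before editing can begin.

```python
# Prototype-driven generation sketch: retrieve the nearest prototype
# sentence, then hand it to an editing model supplied by the caller.
from typing import Callable, List

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(1, len(sa | sb))

def retrieve_prototype(query: str, corpus: List[str],
                       sim: Callable[[str, str], float] = jaccard) -> str:
    # Linear scan over the whole training corpus: O(|corpus|) per query,
    # which is the test-time inefficiency the abstract refers to.
    return max(corpus, key=lambda proto: sim(query, proto))

def generate(query: str, corpus: List[str],
             editor: Callable[[str, str], str]) -> str:
    prototype = retrieve_prototype(query, corpus)
    return editor(prototype, query)  # e.g., a seq2seq editing model
```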