Do you want to publish a course? Click here

Abstractive summarization models heavily rely on copy mechanisms, such as the pointer network or attention, to achieve good performance, measured by textual overlap with reference summaries. As a result, the generated summaries stay close to the form ulations in the source document. We propose the *sentence planner* model to generate more abstractive summaries. It includes a hierarchical decoder that first generates a representation for the next summary sentence, and then conditions the word generator on this representation. Our generated summaries are more abstractive and at the same time achieve high ROUGE scores when compared to human reference summaries. We verify the effectiveness of our design decisions with extensive evaluations.
State-of-the-art abstractive summarization models generally rely on extensive labeled data, which lowers their generalization ability on domains where such data are not available. In this paper, we present a study of domain adaptation for the abstrac tive summarization task across six diverse target domains in a low-resource setting. Specifically, we investigate the second phase of pre-training on large-scale generative models under three different settings: 1) source domain pre-training; 2) domain-adaptive pre-training; and 3) task-adaptive pre-training. Experiments show that the effectiveness of pre-training is correlated with the similarity between the pre-training data and the target domain task. Moreover, we find that continuing pre-training could lead to the pre-trained model's catastrophic forgetting, and a learning method with less forgetting can alleviate this issue. Furthermore, results illustrate that a huge gap still exists between the low-resource and high-resource settings, which highlights the need for more advanced domain adaptation methods for the abstractive summarization task.
Abstract We introduce a simple but flexible mechanism to learn an intermediate plan to ground the generation of abstractive summaries. Specifically, we prepend (or prompt) target summaries with entity chains---ordered sequences of entities mentioned in the summary. Transformer-based sequence-to-sequence models are then trained to generate the entity chain and then continue generating the summary conditioned on the entity chain and the input. We experimented with both pretraining and finetuning with this content planning objective. When evaluated on CNN/DailyMail, XSum, SAMSum, and BillSum, we demonstrate empirically that the grounded generation with the planning objective improves entity specificity and planning in summaries for all datasets, and achieves state-of-the-art performance on XSum and SAMSum in terms of rouge. Moreover, we demonstrate empirically that planning with entity chains provides a mechanism to control hallucinations in abstractive summaries. By prompting the decoder with a modified content plan that drops hallucinated entities, we outperform state-of-the-art approaches for faithfulness when evaluated automatically and by humans.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا