Do you want to publish a course? Click here

The lack of labeled training data for new features is a common problem in rapidly changing real-world dialog systems. As a solution, we propose a multilingual paraphrase generation model that can be used to generate novel utterances for a target feat ure and target language. The generated utterances can be used to augment existing training data to improve intent classification and slot labeling models. We evaluate the quality of generated utterances using intrinsic evaluation metrics and by conducting downstream evaluation experiments with English as the source language and nine different target languages. Our method shows promise across languages, even in a zero-shot setting where no seed data is available.
Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific articles summarization. In this work, we explore four different challenges of the task: handling and differentiating parts of the dialogue belonging to mul tiple speakers, negation understanding, reasoning about the situation, and informal language understanding. Using a pretrained sequence-to-sequence language model, we explore speaker name substitution, negation scope highlighting, multi-task learning with relevant tasks, and pretraining on in-domain data. Our experiments show that our proposed techniques indeed improve summarization performance, outperforming strong baselines.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا