We introduce a data set called DCH-2, which contains 4,390 real customer-helpdesk dialogues in Chinese and their English translations. DCH-2 also contains dialogue-level annotations and turn-level annotations obtained independently from either 19 or 20 annotators. The data set was built through our effort as organisers of the NTCIR-14 Short Text Conversation and NTCIR-15 Dialogue Evaluation tasks, to help researchers understand what constitutes an effective customer-helpdesk dialogue, and thereby build efficient and helpful helpdesk systems that are available to customers at all times. In addition, DCH-2 may be utilised for other purposes, for example, as a repository for retrieval-based dialogue systems, or as a parallel corpus for machine translation in the helpdesk domain.
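The abstract describes DCH-2 records as dialogues with both dialogue-level and turn-level annotations, each turn carrying the original Chinese text and its English translation. A minimal sketch of how such a record might be represented in code is shown below; the field names (`speaker`, `text_zh`, `text_en`, `annotations`) are illustrative assumptions, not the data set's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Turn:
    speaker: str                      # e.g. "customer" or "helpdesk"
    text_zh: str                      # original Chinese utterance
    text_en: str                      # English translation
    annotations: List[str] = field(default_factory=list)  # turn-level labels
                                      # (one entry per annotator, hypothetical)

@dataclass
class Dialogue:
    dialogue_id: str
    turns: List[Turn]
    annotations: List[str] = field(default_factory=list)  # dialogue-level labels

# A toy two-turn dialogue in this hypothetical layout:
dlg = Dialogue(
    dialogue_id="dch2-0001",
    turns=[
        Turn("customer", "你好，我的订单还没到。",
             "Hello, my order has not arrived.", ["question"]),
        Turn("helpdesk", "请提供您的订单号。",
             "Please provide your order number.", ["request"]),
    ],
    annotations=["task-incomplete"],
)
```

Keeping the Chinese source and English translation side by side in each turn is what would also make such a structure usable as a parallel corpus for helpdesk-domain machine translation, as the abstract suggests.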