Research papers and master's and doctoral theses published by Yunhao Li

Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory

80 - Yunhao Li , Yunyi Yang , Xiaojun Quan 2021

Dialogue policy learning, a subtask that determines the content of system response generation and then the degree of task completion, is essential for task-oriented dialogue systems. However, the unbalanced distribution of system actions in dialogue datasets often causes difficulty in learning to generate desired actions and responses. In this paper, we propose a retrieve-and-memorize framework to enhance the learning of system actions. Specifically, we first design a neural context-aware retrieval module to retrieve multiple candidate system actions from the training set given a dialogue context. Then, we propose a memory-augmented multi-decoder network to generate the system actions conditioned on the candidate actions, which allows the network to adaptively select key information in the candidate actions and ignore noise. We conduct experiments on the large-scale multi-domain task-oriented dialogue datasets MultiWOZ 2.0 and MultiWOZ 2.1. Experimental results show that our method achieves competitive performance among several state-of-the-art models in the context-to-response generation task.

Computation and Language

Joint Weakly Supervised AT and AED Using Deep Feature Distillation and Adaptive Focal Loss

113 - Yunhao Liang , Yanhua Long , Yijie Li 2021

A good joint training framework is very helpful for improving the performance of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously. In this study, we propose three methods to improve the best teacher-student framework of DCASE2019 Task 4 for both AT and AED tasks. First, we propose a frame-level, target-event-based deep feature distillation, which aims to leverage the limited strong-labeled data in a weakly supervised framework to learn better intermediate feature maps. Then we propose an adaptive focal loss and a two-stage training strategy to enable effective and more accurate model training, in which the contributions of difficult-to-classify and easy-to-classify acoustic events to the total cost function are adjusted automatically. Furthermore, an event-specific post-processing is designed to improve the prediction of target event time-stamps. Our experiments are performed on the public DCASE2019 Task 4 dataset, and results show that our approach achieves competitive performance in both AT (81.2% F1-score) and AED (49.8% F1-score) tasks.

Audio and Speech Processing, Sound
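As a point of reference for the focal loss mentioned in the abstract above, the following is a minimal sketch of the standard (non-adaptive) binary focal loss. It is not the authors' adaptive variant, whose weighting rule the abstract does not specify. The function names and the fixed `gamma` value are assumptions made for illustration.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for a single prediction.

    Assumption: this is the generic focal loss, not the paper's adaptive
    version. `prob` is the predicted probability of the positive class and
    `target` is 0 or 1.
    """
    # Clamp the probability so the logarithm stays finite.
    p = min(max(prob, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if target == 1 else 1.0 - p
    # The factor (1 - p_t)^gamma down-weights easy, well-classified examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def mean_focal_loss(probs, targets, gamma=2.0):
    """Average the per-example focal loss over a batch of predictions."""
    losses = [binary_focal_loss(p, t, gamma) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)
```

With `gamma=0` the expression reduces to ordinary binary cross-entropy. Larger values of `gamma` shrink the contribution of easy examples relative to hard ones, which is the balance the abstract says is adjusted automatically.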

SkinScan: Low-Cost 3D-Scanning for Dermatologic Diagnosis and Documentation

137 - Merlin A. Nau , Florian Schiffers , Yunhao Li 2021

The use of computational photography is becoming increasingly essential in the medical field. Today, imaging techniques for dermatology range from two-dimensional (2D) color imagery with a mobile device to professional clinical imaging systems measuring additional detailed three-dimensional (3D) data. The latter are commonly expensive and not accessible to a broad audience. In this work, we propose a novel system and software framework that relies only on low-cost (and even mobile) commodity devices present in every household to measure detailed 3D information of the human skin with a 3D-gradient-illumination-based method. We believe that our system has great potential for early-stage diagnosis and monitoring of skin diseases, especially in vastly populated or underdeveloped areas.

Image and Video Processing, Computer Vision and Pattern Recognition

UBAR: Towards Fully End-to-End Task-Oriented Dialog Systems with GPT-2

101 - Yunyi Yang , Yunhao Li , Xiaojun Quan 2020

This paper presents our task-oriented dialog system UBAR, which models task-oriented dialogs on a dialog session level. Specifically, UBAR is acquired by fine-tuning the large pre-trained unidirectional language model GPT-2 on the sequence of the entire dialog session, which is composed of user utterance, belief state, database result, system act, and system response of every dialog turn. Additionally, UBAR is evaluated in a more realistic setting, where its dialog context has access to user utterances and all content it generated such as belief states, system acts, and system responses. Experimental results on the MultiWOZ datasets show that UBAR achieves state-of-the-art performance in multiple settings, improving the combined score of response generation, policy optimization, and end-to-end modeling by 4.7, 3.5, and 9.4 points respectively. Thorough analyses demonstrate that the session-level training sequence formulation and the generated dialog context are essential for UBAR to operate as a fully end-to-end task-oriented dialog system in real life. We also examine the transfer ability of UBAR to new domains with limited data and provide visualization and a case study to illustrate the advantages of UBAR in modeling on a dialog session level.

Computation and Language
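The session-level sequence described in the UBAR abstract above can be illustrated with a minimal sketch that flattens every turn of a dialog session into one training string. The field names, the `<user>`-style delimiter tokens, and the function name are hypothetical choices for illustration; the abstract names the five components of a turn but not the exact markers UBAR uses.

```python
# Order of the components within each turn, as listed in the abstract.
# The key names themselves are hypothetical.
TURN_FIELDS = ("user", "belief", "db", "act", "response")


def flatten_session(turns):
    """Concatenate all turns of a dialog session into one sequence.

    Assumption: each turn is a dict with the keys in TURN_FIELDS, and each
    component is wrapped in made-up <name> ... </name> delimiters.
    """
    parts = []
    for turn in turns:
        for field in TURN_FIELDS:
            parts.append(f"<{field}> {turn[field]} </{field}>")
    return " ".join(parts)


if __name__ == "__main__":
    session = [
        {
            "user": "i need a hotel",
            "belief": "hotel area=north",
            "db": "3 matches",
            "act": "request price",
            "response": "what price range?",
        }
    ]
    print(flatten_session(session))
```

A language model fine-tuned on such strings sees the belief states, system acts, and responses of earlier turns as part of its context, which is the session-level formulation the abstract describes.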

In-plane ordering of O vacancies in a high-Tc cuprate superconductor with compressed Cu-O octahedrons: a first-principles cluster expansion study

49 - Yunhao Li , Shiqiao Du , Zheng-Yu Weng 2019

A recently discovered high-Tc cuprate superconductor Ba2CuO$_{4-\delta}$ exhibits exceptional Jahn-Teller distortion, wherein the CuO6 octahedrons are compressed along the c axis. As a consequence, the O vacancies prefer to reside in the CuO2 plane, but the exact structure is not known. By combining first-principles total energy calculation with the automated structure inversion method, the effective cluster interactions of O vacancies are mapped out. Around $\delta$=0.8, where the 73 K superconductivity was observed experimentally, we predict that the ordered O vacancies slice the CuO2 plane into not only 1D chains but also two-leg ladders. A Monte Carlo simulation is performed based on the effective cluster interaction model, showing that such an ordering pattern is stable up to ~900 K. Our results put forth a concrete structural basis to discuss the underlying superconducting mechanism.

Superconductivity, Materials Science
