Research papers and master's and doctoral theses published by Hao Zhou

Learning from Peers: Transfer Reinforcement Learning for Joint Radio and Cache Resource Allocation in 5G Network Slicing

334 - Hao Zhou, Melike Erol-Kantarci, Vincent Poor 2021

Radio access network (RAN) slicing is an important part of network slicing in 5G. The evolving network architecture requires the orchestration of multiple network resources such as radio and cache resources. In recent years, machine learning (ML) techniques have been widely applied for network slicing. However, most existing works do not take advantage of the knowledge transfer capability in ML. In this paper, we propose a transfer reinforcement learning (TRL) scheme for joint radio and cache resource allocation to serve 5G RAN slicing. We first define a hierarchical architecture for the joint resource allocation. Then we propose two TRL algorithms: Q-value transfer reinforcement learning (QTRL) and action selection transfer reinforcement learning (ASTRL). In the proposed schemes, learner agents utilize the expert agents' knowledge to improve their performance on target tasks. The proposed algorithms are compared with both the model-free Q-learning and the model-based priority proportional fairness and time-to-live (PPF-TTL) algorithms. Compared with Q-learning, QTRL and ASTRL present 23.9% lower delay for the Ultra-Reliable Low-Latency Communications (URLLC) slice and 41.6% higher throughput for the enhanced Mobile Broadband (eMBB) slice, while achieving significantly faster convergence than Q-learning. Moreover, 40.3% lower URLLC delay and almost twice the eMBB throughput are observed with respect to PPF-TTL.
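The abstract does not spell out how QTRL transfers knowledge, so the following is only a minimal sketch of one common form of Q-value transfer: a learner's tabular Q-function is warm-started from an expert's table and then refined with the standard Q-learning update. All function names (`make_q_table`, `transfer_q_values`, `q_update`, `epsilon_greedy`), the state labels, and the hyperparameter defaults are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    """Q-table mapping state -> list of action values, zero by default."""
    return defaultdict(lambda: [0.0] * n_actions)


def transfer_q_values(expert_q, n_actions):
    """Warm-start a learner by copying the expert's Q-values.

    Assumption: a plain copy stands in for the paper's transfer rule,
    which the abstract does not describe.
    """
    learner_q = make_q_table(n_actions)
    for state, values in expert_q.items():
        learner_q[state] = list(values)  # copy so the expert stays unchanged
    return learner_q


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, n_actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q[state]
    return max(range(n_actions), key=lambda a: values[a])
```

A learner started this way already prefers the expert's best-known actions in states the expert has visited, which is one plausible reason for the faster convergence the abstract reports.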

Systems and Control, Networking and Internet Architecture

A kilonova from an ultra-quick merger of a neutron star binary

358 - Zhi-Ping Jin, Hao Zhou, Stefano Covino 2021

GRB 060505 was the first well-known nearby (at redshift 0.089) hybrid gamma-ray burst that has a duration longer than 2 seconds but without the association of a supernova down to very stringent limits. The prompt $\gamma$-ray flash lasting $\sim 4$ sec could consist of an intrinsic short burst and its tail emission, but the sizable temporal lag ($\sim 0.35$ sec) as well as the environment properties led to the widely accepted classification of a long-duration gamma-ray burst originating from the collapse of a massive star. Here for the first time we report convincing evidence for a thermal-like optical radiation component in the spectral energy distribution of the early afterglow emission. In comparison to AT2017gfo, the thermal radiation is $\sim 2$ times brighter and the temperature is comparable at similar epochs. The optical decline is much quicker than that in X-rays, which is also at odds with the fireball afterglow model but quite natural for the presence of a blue kilonova. Our finding reveals a neutron star merger origin of the hybrid GRB 060505 and strongly supports the theoretical speculation that some binary neutron stars can merge ultra-quickly (within $\sim 1$ Myr) after their formation when the surrounding region is still highly star-forming and the metallicity remains low. Joint gravitational-wave and electromagnetic observations are expected to confirm such scenarios in the near future.

High Energy Astrophysical Phenomena

A modular quantum computer based on a quantum state router

388 - Chao Zhou, Pinlei Lu, Matthieu Praquin 2021

In this work, we present the design of a superconducting, microwave quantum state router which can realize all-to-all couplings among four quantum modules. Each module consists of a single transmon, readout mode, and communication mode coupled to the router. The router design centers on a parametrically driven, Josephson-junction-based three-wave mixing element which generates photon exchange among the modules' communication modes. We first demonstrate SWAP operations among the four communication modes, with an average full-SWAP time of 760 ns and average inter-module gate fidelity of 0.97, limited by our modes' coherences. We also demonstrate photon transfer and pairwise entanglement between the modules' qubits, and parallel operation of simultaneous SWAP gates across the router. These results can readily be extended to faster and higher-fidelity router operations, as well as scaled to support larger networks of quantum modules.

Quantum Physics

PR-Net: Preference Reasoning for Personalized Video Highlight Detection

92 - Runnan Chen, Penghao Zhou, Wenzhe Wang 2021

Personalized video highlight detection aims to shorten a long video to interesting moments according to a user's preference, which has recently attracted the community's attention. Current methods regard the user's history as holistic information to predict the user's preference but neglect the inherent diversity of the user's interests, resulting in vague preference representation. In this paper, we propose a simple yet efficient preference reasoning framework (PR-Net) to explicitly take the diverse interests into account for frame-level highlight prediction. Specifically, distinct user-specific preferences for each input query frame are produced, presented as the similarity-weighted sum of history highlights to the corresponding query frame. Next, distinct comprehensive preferences are formed by the user-specific preferences and a learnable generic preference for more overall highlight measurement. Lastly, the degree of highlight and non-highlight for each query frame is calculated as semantic similarity to its comprehensive and non-highlight preferences, respectively. Besides, to alleviate the ambiguity due to the incomplete annotation, a new bi-directional contrastive loss is proposed to ensure a compact and differentiable metric space. In this way, our method significantly outperforms state-of-the-art methods with a relative improvement of 12% in mean accuracy precision.
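The "similarity-weighted sum of history highlights" described in the abstract can be illustrated with a small sketch. It assumes cosine similarity between feature vectors and softmax normalization of the weights; the abstract specifies neither, and the function names (`cosine`, `softmax`, `user_preference`) are hypothetical rather than taken from PR-Net.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def user_preference(query, history):
    """Preference vector for one query frame.

    Each history highlight feature is weighted by its (assumed cosine)
    similarity to the query frame, and the weighted features are summed.
    """
    weights = softmax([cosine(query, h) for h in history])
    dim = len(query)
    return [sum(w * h[i] for w, h in zip(weights, history)) for i in range(dim)]
```

Because the weights are recomputed per query frame, two frames from the same video can yield different preference vectors, which is the per-frame diversity the abstract emphasizes.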

Computer Vision and Pattern Recognition

Evidence for an Algebra of $\boldsymbol{G_2}$ Instantons

44 - Michele Del Zotto, Jihwan Oh, Yehao Zhou 2021

In this short note, we present some evidence towards the existence of an algebra of BPS $G_2$ instantons. These are instantonic configurations that govern the partition functions of 7d SYM theories on local $G_2$ holonomy manifolds $\mathcal X$. To shed light on such structure, we begin investigating the relation with parent 4d $\mathcal N=1$ theories obtained by geometric engineering M-theory on $\mathcal X$. The main point of this paper is to substantiate the following dream: the holomorphic sector of such theories on multi-centered Taub-NUT spaces gives rise to an algebra whose characters organise the $G_2$ instanton partition function. As a first step towards this program, we argue by string duality that a multitude of geometries $\mathcal X$ exist that are dual to well-known 4d SCFTs arising from D3 brane probes of CY cones: all these models are amenable to analysis along the lines suggested by Dijkgraaf, Gukov, Neitzke and Vafa in the context of topological M-theory. Moreover, we discuss an interesting relation to Costello's twisted M-theory, which arises at local patches, and is a key ingredient in identifying the relevant algebras.

High Energy Physics - Theory

Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection

110 - Junxiao Xue, Hao Zhou, Yabo Wang 2021

Speaker verification systems have been used in many production scenarios in recent years. Unfortunately, they are still highly prone to different kinds of spoofing attacks such as voice conversion and speech synthesis. In this paper, we propose a new method based on physiological-physical feature fusion to deal with voice spoofing attacks. This method involves feature extraction, a densely connected convolutional neural network with squeeze and excitation block (SE-DenseNet), a multi-scale residual neural network with squeeze and excitation block (SE-Res2Net) and feature fusion strategies. We first pre-train a convolutional neural network using the speaker's voice and face in the video as surveillance signals. It can extract physiological features from speech. Then we use SE-DenseNet and SE-Res2Net to extract physical features. Such a dense connection pattern has high parameter efficiency, and the squeeze and excitation block can enhance the transmission of the feature. Finally, we integrate the two features into the SE-DenseNet to identify the spoofing attacks. Experimental results on the ASVspoof 2019 data set show that our model is effective for voice spoofing detection. In the logical access scenario, our model improves the tandem decision cost function (t-DCF) and equal error rate (EER) scores by 4% and 7%, respectively, compared with other methods. In the physical access scenario, our model improves t-DCF and EER scores by 8% and 10%, respectively.

Audio and Speech Processing, Sound, Image and Video Processing

Self-training Improves Pre-training for Few-shot Learning in Task-oriented Dialog Systems

101 - Fei Mi, Wanhao Zhou, Fengyu Cai 2021

As the labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge is to train different modules with the least amount of labeled data. Recently, large-scale pre-trained language models have shown promising results for few-shot learning in ToD. In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems. Specifically, we propose a self-training approach that iteratively labels the most confident unlabeled data to train a stronger Student model. Moreover, a new text augmentation technique (GradAug) is proposed to better train the Student by replacing non-crucial tokens using a masked language model. We conduct extensive experiments and present analyses on four downstream tasks in ToD, including intent classification, dialog state tracking, dialog act prediction, and response selection. Empirical results demonstrate that the proposed self-training approach consistently improves state-of-the-art pre-trained models (BERT, ToD-BERT) when only a small amount of labeled data is available.
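The step "iteratively labels the most confident unlabeled data" can be sketched as a single self-training round. This is an illustration under assumptions, not the paper's procedure: the `predict` callback stands in for whatever model produces label probabilities, examples are assumed to be hashable (for instance strings), retraining between rounds is omitted, and the names `select_most_confident` and `self_training_round` are hypothetical.

```python
def select_most_confident(predictions, k):
    """Return the k highest-confidence (example, label, confidence) triples.

    predictions: list of (example, {label: probability}) pairs.
    """
    scored = []
    for example, probs in predictions:
        label = max(probs, key=probs.get)  # most likely label
        scored.append((example, label, probs[label]))
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]


def self_training_round(labeled, unlabeled, predict, k):
    """Pseudo-label the k most confident unlabeled examples.

    Returns the enlarged labeled set and the remaining unlabeled pool.
    A real loop would retrain the model on the enlarged set before the
    next round; that step is left out of this sketch.
    """
    predictions = [(x, predict(x)) for x in unlabeled]
    chosen = select_most_confident(predictions, k)
    chosen_examples = {x for x, _, _ in chosen}
    new_labeled = labeled + [(x, y) for x, y, _ in chosen]
    remaining = [x for x in unlabeled if x not in chosen_examples]
    return new_labeled, remaining
```

Selecting only the top-k predictions per round limits how much label noise enters the training set at once, at the cost of needing more rounds to consume the unlabeled pool.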

Computation and Language

DC-GNet: Deep Mesh Relation Capturing Graph Convolution Network for 3D Human Shape Reconstruction

116 - Shihao Zhou, Mengxi Jiang, Shanshan Cai 2021

In this paper, we aim to reconstruct a full 3D human shape from a single image. Previous vertex-level and parameter regression approaches reconstruct 3D human shape based on a pre-defined adjacency matrix to encode positive relations between nodes. The deep topological relations for the surface of the 3D human body are not carefully exploited. Moreover, the performance of most existing approaches often suffers from a domain gap when handling more occlusion cases in real-world scenes. In this work, we propose a Deep Mesh Relation Capturing Graph Convolution Network, DC-GNet, with a shape completion task for 3D human shape reconstruction. Firstly, we propose to capture deep relations within mesh vertices, where an adaptive matrix encoding both positive and negative relations is introduced. Secondly, we propose a shape completion task to learn a prior about various kinds of occlusion cases. Our approach encodes mesh structure from more subtle relations between nodes in a more distant region. Furthermore, our shape completion module alleviates the performance degradation issue in the outdoor scene. Extensive experiments on several benchmarks show that our approach outperforms the previous 3D human pose and shape estimation approaches.

Computer Vision and Pattern Recognition

SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling

221 - Fengyu Cai, Wanhao Zhou, Fei Mi 2021

Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems. Most existing approaches assume that only a single intent exists in an utterance. However, there are often multiple intents within an utterance in real-life scenarios. In this paper, we propose a multi-intent NLU framework, called SLIM, to jointly learn multi-intent detection and slot filling based on BERT. To fully exploit the existing annotation data and capture the interactions between slots and intents, SLIM introduces an explicit slot-intent classifier to learn the many-to-one mapping between slots and intents. Empirical results on three public multi-intent datasets demonstrate (1) the superior performance of SLIM compared to the current state-of-the-art for NLU with multiple intents and (2) the benefits obtained from the slot-intent classifier.

Artificial Intelligence

Design of a Flying Humanoid Robot Based on Thrust Vector Control

129 - Yuhang Li, Yuhao Zhou, Junbin Huang 2021

Achieving short-distance flight helps improve the efficiency of humanoid robots moving in complex environments (e.g., crossing large obstacles or reaching high places) for rapid emergency missions. This study proposes a design of a flying humanoid robot named Jet-HR2. The robot has 10 joints driven by brushless motors and harmonic drives for locomotion. To overcome the challenge of the stable-attitude takeoff in small thrust-to-weight conditions, the robot was designed based on the concept of thrust vectoring. The propulsion system consists of four ducted fans, that is, two fixed on the waist of the robot and the other two mounted on the feet, for thrust vector control. The thrust vector is controlled by adjusting the attitude of the foot during the flight. A simplified model and control strategies are proposed to solve the problem of attitude instability caused by mass errors and joint position errors during takeoff. The experimental results show that the robot's spin and dive behaviors during takeoff were effectively suppressed by controlling the thrust vector of the ducted fan on the foot. The robot successfully achieved takeoff at a thrust-to-weight ratio of 1.17 (17 kg / 20 kg) and maintained a stable attitude, reaching a takeoff height of over 1000 mm.
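As a quick check of the reported ratio, the parenthetical can be read as a robot mass of 17 kg against a total thrust equivalent to 20 kg. That reading is an assumption, since the abstract does not label the two figures. Under it, and in the idealized case where all thrust points straight up and drag is ignored, the ratio and the resulting net upward acceleration work out as:

```latex
\[
\frac{T}{W} = \frac{20\ \text{kgf}}{17\ \text{kgf}} \approx 1.176,
\qquad
a = \left(\frac{T}{W} - 1\right) g \approx 0.176 \times 9.81\ \text{m/s}^2 \approx 1.7\ \text{m/s}^2 .
\]
```

The value 1.176 is consistent with the quoted 1.17 if the figure was truncated rather than rounded. The small margin above 1 illustrates why the abstract describes takeoff as a low thrust-to-weight problem.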

Robotics
