Research papers, master's theses, and doctoral dissertations published by Hao Zhu

M2RNet: Multi-modal and Multi-scale Refined Network for RGB-D Salient Object Detection

153 - Xian Fang , Jinchao Zhu , Ruixun Zhang 2021

Salient object detection is a fundamental topic in computer vision. Previous methods based on RGB-D often suffer from the incompatibility of multi-modal feature fusion and the insufficiency of multi-scale feature aggregation. To tackle these two dilemmas, we propose a novel multi-modal and multi-scale refined network (M2RNet). Three essential components are presented in this network. The nested dual attention module (NDAM) explicitly exploits the combined features of RGB and depth flows. The adjacent interactive aggregation module (AIAM) gradually integrates the neighbor features of high, middle and low levels. The joint hybrid optimization loss (JHOL) makes the predictions have a prominent outline. Extensive experiments demonstrate that our method outperforms other state-of-the-art approaches.

Computer Vision and Pattern Recognition

The Promise of Dataflow Architectures in the Design of Processing Systems for Autonomous Machines

424 - Shaoshan Liu , Yuhao Zhu , Bo Yu 2021

The commercialization of autonomous machines is a thriving sector, and likely to be the next major computing demand driver, after PC, cloud computing, and mobile computing. Nevertheless, a suitable computer architecture for autonomous machines is missing, and many companies are forced to develop ad hoc computing solutions that are neither scalable nor extensible. In this article, we analyze the demands of autonomous machine computing, and argue for the promise of dataflow architectures in autonomous machines.

Hardware Architecture; Artificial Intelligence; Robotics

Why Existing Machine Learning Methods Fails At Extracting the Information of Future Returns Out of Historical Stock Prices: the Curve-Shape-Feature and Non-Curve-Shape-Feature Modes

128 - Jia-Yao Yang , Hao Zhu , Yue-Jie Hou 2021

Financial time series analysis is an important way to probe the complex laws of financial markets. Among the many goals of financial time series analysis, one is to construct a model that can extract information about the future return from known historical stock data, such as stock prices, financial news, etc. To design such a model, prior knowledge of how the future return is correlated with the historical stock prices is needed. In this work, we focus on the question: in what mode is the future return correlated with the historical stock prices? We manually design several financial time series where the future return is correlated with the historical stock prices in pre-designed modes, namely the curve-shape-feature (CSF) and the non-curve-shape-feature (NCSF) modes. In the CSF mode, the future return can be extracted from the curve shapes of the historical stock prices. By applying various kinds of existing algorithms to those pre-designed time series and to real financial time series, we show that: (1) the major information of the future return is not contained in the curve-shape features of historical stock prices. That is, the future return is not mainly correlated with the historical stock prices in the CSF mode. (2) Various kinds of existing machine learning algorithms are good at extracting the curve-shape features in the historical stock prices and thus are inappropriate for financial time series analysis, although they are successful in image recognition and natural language processing. That is, new models handling the NCSF series are needed in financial time series analysis.

Computational Engineering, Finance, and Science

Generative Quantum Learning of Joint Probability Distribution Functions

78 - Elton Yechao Zhu , Sonika Johri , Dave Bacon 2021

Modeling joint probability distributions is an important task in a wide variety of fields. One popular technique for this employs a family of multivariate distributions with uniform marginals called copulas. While the theory of modeling joint distributions via copulas is well understood, it gets practically challenging to accurately model real data with many variables. In this work, we design quantum machine learning algorithms to model copulas. We show that any copula can be naturally mapped to a multipartite maximally entangled state. A variational ansatz we christen as a "qopula" creates arbitrary correlations between variables while maintaining the copula structure starting from a set of Bell pairs for two variables, or GHZ states for multiple variables. As an application, we train a Quantum Generative Adversarial Network (QGAN) and a Quantum Circuit Born Machine (QCBM) using this variational ansatz to generate samples from joint distributions of two variables for historical data from the stock market. We demonstrate our generative learning algorithms on trapped ion quantum computers from IonQ for up to 8 qubits and show that our results outperform those obtained through equivalent classical generative learning. Further, we present theoretical arguments for exponential advantage in our model's expressivity over classical models based on communication and computational complexity arguments.

Quantum Physics
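
The abstract above describes copulas as multivariate distributions with uniform marginals. As a purely classical illustration of that definition, the following minimal sketch samples a two-variable Gaussian copula. It is not the paper's quantum "qopula" ansatz; the function name `sample_gaussian_copula` and the parameters `rho`, `n`, and `seed` are assumptions chosen for this example.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(rho, n, seed=0):
    """Draw n pairs (u, v) from a bivariate Gaussian copula with correlation rho.

    Hypothetical helper for illustration only: each coordinate is marginally
    uniform on [0, 1], while the dependence between u and v is set by rho.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    samples = []
    for _ in range(n):
        # Build two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to a uniform marginal.
        samples.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return samples


if __name__ == "__main__":
    for u, v in sample_gaussian_copula(rho=0.8, n=5):
        print(f"{u:.3f} {v:.3f}")
```

Applying the inverse CDF of any target marginal to `u` and `v` would then give a joint distribution with those marginals and the same dependence structure, which is the usual reason for modeling the copula separately.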

Flow-Guided Video Inpainting with Scene Templates

92 - Dong Lao , Peihao Zhu , Peter Wonka 2021

We consider the problem of filling in missing spatio-temporal regions of a video. We provide a novel flow-based solution by introducing a generative model of images in relation to the scene (without missing regions) and mappings from the scene to images. We use the model to jointly infer the scene template, a 2D representation of the scene, and the mappings. This ensures consistency of the frame-to-frame flows generated to the underlying scene, reducing geometric distortions in flow-based inpainting. The template is mapped to the missing regions in the video by a new L2-L1 interpolation scheme, creating crisp inpaintings and reducing common blur and distortion artifacts. We show on two benchmark datasets that our approach out-performs state-of-the-art quantitatively and in user studies.

Computer Vision and Pattern Recognition; Artificial Intelligence

Lyra: A Benchmark for Turducken-Style Code Generation

143 - Qingyuan Liang , Zeyu Sun , Qihao Zhu 2021

Code generation is crucial to reduce manual software development efforts. Recently, neural techniques have been used to generate source code automatically. While promising, these approaches are evaluated on tasks for generating code in single programming languages. However, in actual development, one programming language is often embedded in another. For example, SQL statements are often embedded as strings in base programming languages such as Python and Java, and JavaScript programs are often embedded in server-side programming languages, such as PHP, Java, and Python. We call this turducken-style programming. In this paper, we define a new code generation task: given a natural language comment, this task aims to generate a program in a base language with an embedded language. To our knowledge, this is the first turducken-style code generation task. For this task, we present Lyra: a dataset in Python with embedded SQL. This dataset contains 2,000 carefully annotated database manipulation programs from real usage projects. Each program is paired with both a Chinese comment and an English comment. In our experiment, we adopted Transformer, a state-of-the-art technique, as the baseline. In the best setting, Transformer achieves 0.5% and 1.5% AST exact matching accuracy using Chinese and English comments, respectively. Therefore, we believe that Lyra provides a new challenge for code generation.

Software Engineering; Artificial Intelligence
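
To make the term in the Lyra abstract concrete, here is a small sketch of what a turducken-style program can look like: a Python function (the base language) whose body carries an SQL statement (the embedded language) as a string. It is an illustrative example only and is not taken from the Lyra dataset; the `users` table, the helper `build_demo_db`, and the function `count_users_by_city` are all hypothetical names, and the standard-library `sqlite3` module is used simply so the example runs on its own.

```python
import sqlite3


def count_users_by_city(conn, city):
    """Return the number of users living in the given city."""
    # The SQL text below is the embedded language, held in a Python string.
    query = "SELECT COUNT(*) FROM users WHERE city = ?"
    cursor = conn.execute(query, (city,))
    return cursor.fetchone()[0]


def build_demo_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("Ann", "Paris"), ("Bo", "Paris"), ("Cy", "Rome")],
    )
    return conn


if __name__ == "__main__":
    demo_conn = build_demo_db()
    print(count_users_by_city(demo_conn, "Paris"))
```

In the task the abstract defines, the docstring plays the role of the natural language comment, and a generator would have to produce both the Python scaffolding and a valid SQL string inside it.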

Distributionally robust goal-reaching optimization in the presence of background risk

111 - Yichun Chi , Zuo Quan Xu , Sheng Chao Zhuang 2021

In this paper, we examine the effect of background risk on portfolio selection and optimal reinsurance design under the criterion of maximizing the probability of reaching a goal. Following the literature, we adopt dependence uncertainty to model the dependence ambiguity between financial risk (or insurable risk) and background risk. Because the goal-reaching objective function is non-concave, these two problems bring highly unconventional and challenging issues for which classical optimization techniques often fail. Using the quantile formulation method, we derive the optimal solutions explicitly. The results show that the presence of background risk does not alter the shape of the solution but instead changes the parameter value of the solution. Finally, numerical examples are given to illustrate the results and verify the robustness of our solutions.

Statistics and Risk Management; Probability; Statistics and Mathematical Finance

Detailed Avatar Recovery from Single Image

397 - Hao Zhu , Xinxin Zuo , Haotian Yang 2021

This paper presents a novel framework to recover a detailed avatar from a single image. It is a challenging task due to factors such as variations in human shapes, body poses, texture, and viewpoints. Prior methods typically attempt to recover the human body shape using a parametric-based template that lacks the surface details. As such, the resulting body shape appears to be without clothing. In this paper, we propose a novel learning-based framework that combines the robustness of the parametric model with the flexibility of free-form 3D deformation. We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation (HMD) framework, utilizing the constraints from body joints, silhouettes, and per-pixel shading information. Our method can restore detailed human body shapes with complete textures beyond skinned models. Experiments demonstrate that our method has outperformed previous state-of-the-art approaches, achieving better accuracy in terms of both 2D IoU number and 3D metric distance.

Computer Vision and Pattern Recognition

Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference

141 - Juncheng Li , Siliang Tang , Linchao Zhu 2021

Video-and-Language Inference is a recently proposed task for joint video-and-language understanding. This new task requires a model to draw inference on whether a natural language statement entails or contradicts a given video clip. In this paper, we study how to address three critical challenges for this task: judging the global correctness of a statement involving multiple semantic meanings, joint reasoning over video and subtitles, and modeling long-range relationships and complex social interactions. First, we propose an adaptive hierarchical graph network that achieves in-depth understanding of the video over complex interactions. Specifically, it performs joint reasoning over video and subtitles in three hierarchies, where the graph structure is adaptively adjusted according to the semantic structures of the statement. Secondly, we introduce semantic coherence learning to explicitly encourage the semantic coherence of the adaptive hierarchical graph network from three hierarchies. The semantic coherence learning can further improve the alignment between vision and linguistics, and the coherence across a sequence of video segments. Experimental results show that our method outperforms the baseline by a large margin.

Computer Vision and Pattern Recognition

Controlled synthesis of MoxW1-xTe2 atomic layers with emergent quantum states

87 - Ya Deng , Peiling Li , Chao Zhu 2021

Recently, new states of matter like superconducting or topological quantum states were found in transition metal dichalcogenides (TMDs) and manifested themselves in a series of exotic physical behaviors. Such phenomena have been demonstrated to exist in a series of transition metal tellurides including MoTe2, WTe2 and alloyed MoxW1-xTe2. However, the behaviors in the alloy system have rarely been addressed due to the difficulty of obtaining atomic layers with controlled composition, although the alloy offers a great platform to tune the quantum states. Here, we report a facile CVD method to synthesize MoxW1-xTe2 with controllable thickness and chemical composition ratios. The atomic structure of the monolayer MoxW1-xTe2 alloy was experimentally confirmed by scanning transmission electron microscopy (STEM). Importantly, two different transport behaviors including superconducting and Weyl semimetal (WSM) states were observed in Mo-rich Mo0.8W0.2Te2 and W-rich Mo0.2W0.8Te2 samples respectively. Our results show that the electrical properties of MoxW1-xTe2 can be tuned by controlling the chemical composition, demonstrating that our controllable CVD growth method is an efficient strategy to manipulate the physical properties of TMDCs. Meanwhile, it provides a perspective for further comprehension and sheds light on the design of devices with topological multicomponent TMDC materials.

Materials Science
