Yipeng Liu, Qi Yang, Yiling Xu (2021)
Point cloud compression (PCC) has made remarkable progress in recent years. In the meantime, point cloud quality assessment (PCQA) has also seen gratifying development. Some recently emerged metrics show robust performance on public point cloud assessment databases. However, these metrics have not been evaluated specifically for PCC to verify whether they are consistent with subjective perception. In this paper, we first establish a new dataset for compression evaluation, which contains 175 compressed point clouds in total, derived from 7 compression algorithms at 5 compression levels. Then, leveraging the proposed dataset, we evaluate the performance of existing PCQA metrics across the different compression types. The results reveal several deficiencies of existing metrics in compression evaluation.
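As a point of reference, objective PCQA metrics are usually benchmarked against subjective scores via correlation coefficients. The sketch below is not the authors' code; the metric names and score arrays are placeholders, and the subjective scores are randomly generated for illustration.

```python
# Minimal sketch of benchmarking objective PCQA scores against subjective MOS
# with PLCC (Pearson) and SROCC (Spearman). All data below are placeholders.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
mos = rng.uniform(1.0, 5.0, size=175)            # subjective scores, one per compressed cloud
metric_scores = {
    "metric_A": mos + rng.normal(0, 0.5, 175),   # placeholder objective scores
    "metric_B": rng.uniform(0, 1, 175),
}

for name, scores in metric_scores.items():
    plcc, _ = pearsonr(scores, mos)    # linear agreement with subjective perception
    srocc, _ = spearmanr(scores, mos)  # monotonic (rank) consistency
    print(f"{name}: PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```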
This paper proves the following statement: If a convex body can form a three- or fourfold translative tiling in three-dimensional space, it must be a parallelohedron. In other words, it must be a parallelotope, a hexagonal prism, a rhombic dodecahedron, an elongated dodecahedron, or a truncated octahedron.
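For reference, the claim can be restated compactly; the formalization below is a paraphrase under the standard definition of a multiple tiling, not taken verbatim from the paper.

```latex
% A convex body $K$ and a discrete multiset $X \subset \mathbb{R}^3$ form a $k$-fold
% translative tiling if almost every point of space is covered by exactly $k$ of the
% translates $K + x$, $x \in X$.
\[
  K \text{ admits a } k\text{-fold translative tiling of } \mathbb{R}^3,\ k \in \{3,4\}
  \;\Longrightarrow\; K \text{ is a parallelohedron.}
\]
```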
This paper addresses the problem of face video inpainting. Existing video inpainting methods primarily target natural scenes with repetitive patterns. They do not make use of any prior knowledge of the face to help retrieve correspondences for the corrupted face. They therefore only achieve sub-optimal results, particularly for faces under large pose and expression variations where face components appear very differently across frames. In this paper, we propose a two-stage deep learning method for face video inpainting. We employ 3DMM as our 3D face prior to transform a face between the image space and the UV (texture) space. In Stage I, we perform face inpainting in the UV space. This largely removes the influence of face poses and expressions and makes the learning task much easier with well-aligned face features. We introduce a frame-wise attention module to fully exploit correspondences in neighboring frames to assist the inpainting task. In Stage II, we transform the inpainted face regions back to the image space and perform face video refinement that inpaints any background regions not covered in Stage I and also refines the inpainted face regions. Extensive experiments show that our method significantly outperforms methods based merely on 2D information, especially for faces under large pose and expression variations.
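The sketch below illustrates one way a frame-wise attention step in UV space could look: features of the current frame attend to features of neighbouring frames to borrow visible texture for corrupted regions. It is an assumption-laden illustration, not the paper's implementation; the 1x1 projection layers, tensor shapes, and residual fusion are all placeholders.

```python
# Minimal sketch of frame-wise attention over neighbouring UV-space feature maps.
import torch
import torch.nn as nn

class FrameWiseAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, cur, neighbors):
        # cur: (B, C, H, W) UV-space features of the frame being inpainted
        # neighbors: (B, T, C, H, W) features of T neighbouring frames
        B, T, C, H, W = neighbors.shape
        q = self.q(cur).flatten(2).transpose(1, 2)                        # (B, HW, C)
        k = self.k(neighbors.flatten(0, 1)).flatten(2)                    # (B*T, C, HW)
        v = self.v(neighbors.flatten(0, 1)).flatten(2)                    # (B*T, C, HW)
        k = k.view(B, T, C, H * W).permute(0, 2, 1, 3).reshape(B, C, T * H * W)
        v = v.view(B, T, C, H * W).permute(0, 1, 3, 2).reshape(B, T * H * W, C)
        attn = torch.softmax(q @ k / C ** 0.5, dim=-1)                    # (B, HW, T*HW)
        out = (attn @ v).transpose(1, 2).reshape(B, C, H, W)
        return cur + out                                                  # residual fusion

# e.g. FrameWiseAttention(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 4, 64, 32, 32))
```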
Sub-millimeter emission lines produced by the interstellar medium (ISM) are strong tracers of star formation and are some of the main targets of line intensity mapping (LIM) surveys. In this work we present an empirical multi-line emission model that simultaneously covers the mean, scatter, and correlations of the [CII], CO J=1-0 to J=5-4, and [CI] lines in the redshift range $1 \leq z \leq 9$. We assume that the galaxy ISM line emission luminosity versus halo mass relations can be described by double power laws with redshift-dependent log-normal scatter. The model parameters are then derived by fitting to state-of-the-art semi-analytic simulation results that have successfully reproduced multiple sub-millimeter line observations at $0 \leq z \lesssim 6$. We cross-check the line emission statistics predicted by the semi-analytic simulation and our empirical model, finding that at $z \geq 1$ our model reproduces the simulated line intensities with a fractional error of less than about 10%. The fractional difference is less than 25% for the power spectra. Grounded in physically motivated and self-consistent galaxy simulations, this computationally efficient model will be helpful in forecasting ISM emission line statistics for upcoming LIM surveys.
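The snippet below sketches the parametrization described above: line luminosity versus halo mass as a double power law, with redshift-dependent log-normal scatter. The functional form, parameter values, and scatter evolution are illustrative assumptions, not the fitted values from the paper.

```python
# Minimal sketch of a double power law L(M, z) with redshift-dependent log-normal scatter.
import numpy as np

def line_luminosity(m_halo, z, norm, m_break, alpha, beta, sigma_z, rng=None):
    """Mean relation: L = 2 * norm * M / [(M/M_b)^-alpha + (M/M_b)^-beta]."""
    mean = 2.0 * norm * m_halo / ((m_halo / m_break) ** (-alpha) + (m_halo / m_break) ** (-beta))
    if rng is None:
        return mean
    # multiply by log-normal scatter with redshift-dependent width sigma_z(z), in dex
    return mean * 10 ** rng.normal(0.0, sigma_z(z), size=np.shape(m_halo))

rng = np.random.default_rng(1)
m = np.logspace(10, 13, 5)                       # halo masses in Msun (placeholder range)
sigma = lambda z: 0.3 + 0.02 * z                 # placeholder scatter evolution in dex
print(line_luminosity(m, z=2.0, norm=1e-6, m_break=1e12,
                      alpha=1.5, beta=-0.5, sigma_z=sigma, rng=rng))
```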
Sparse voxel-based 3D convolutional neural networks (CNNs) are widely used for various 3D vision tasks. Sparse voxel-based 3D CNNs create sparse non-empty voxels from the 3D input and perform 3D convolution operations on them only. We propose a simple yet effective padding scheme --- interpolation-aware padding --- to pad a few empty voxels adjacent to the non-empty voxels and involve them in the 3D CNN computation so that all neighboring voxels exist when computing point-wise features via trilinear interpolation. For fine-grained 3D vision tasks where point-wise features are essential, like semantic segmentation and 3D detection, our network achieves higher prediction accuracy than the existing networks using the nearest neighbor interpolation or the normalized trilinear interpolation with the zero-padding or the octree-padding scheme. Through extensive comparisons on various 3D segmentation and detection tasks, we demonstrate the superiority of 3D sparse CNNs with our padding scheme in conjunction with feature interpolation.
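The toy NumPy sketch below conveys the idea behind interpolation-aware padding: trilinear interpolation of a point-wise feature needs all 8 surrounding voxels, so empty corner voxels are added to the active set before convolution. This is a simplified illustration under a voxel-center convention I am assuming, not the authors' sparse-CNN code; it pads every corner voxel needed by the query points rather than restricting to neighbors of non-empty voxels.

```python
# Toy sketch: find the 8 corner voxels each query point interpolates from, and
# union the empty ones into the active voxel set ("padding") before convolution.
import numpy as np

def corner_voxels(points, voxel_size):
    """Return the 8 integer voxel coordinates surrounding each point (voxel centers at (i+0.5)*s)."""
    base = np.floor(points / voxel_size - 0.5).astype(int)           # lower corner voxel
    offsets = np.stack(np.meshgrid([0, 1], [0, 1], [0, 1], indexing="ij"), -1).reshape(8, 3)
    return base[:, None, :] + offsets[None, :, :]                    # (N, 8, 3)

def interpolation_aware_padding(occupied, points, voxel_size):
    """Add every empty corner voxel needed by some query point to the active set."""
    needed = {tuple(v) for v in corner_voxels(points, voxel_size).reshape(-1, 3)}
    return occupied | needed                                         # padded voxel set

occupied = {(0, 0, 0), (1, 0, 0)}                                    # toy non-empty voxels
pts = np.array([[0.6, 0.6, 0.6]])                                    # query point (voxel_size = 1)
print(sorted(interpolation_aware_padding(occupied, pts, 1.0)))
```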
Siqi Yang, Yao Fu, Minghui Liu (2021)
This article proposes a novel method for unbiased PDF updating by using the forward-backward asymmetry $(A_{FB})$ in the Drell-Yan $pp \rightarrow Z/\gamma^{*} \rightarrow \ell^+\ell^-$ process. The $A_{FB}$ spectrum, as a function of the dilepton mass, is not only governed by the electroweak (EW) fundamental parameter, i.e. the weak mixing angle $\sin^2\theta_{W}$, but also sensitive to the parton distribution functions (PDFs). When performing simultaneous or iterative fittings for the PDF updating and EW parameter extraction with the same $A_{FB}$, the strong correlations between them may induce large bias into these two sectors. From our studies, it was found that the sensitivity of $A_{FB}$ to $\sin^2\theta_{W}$ is dominated by its average value around the $Z$ pole region, while the shape (or gradient) of the $A_{FB}$ spectrum is insensitive to $\sin^2\theta_{W}$ but highly sensitive to the PDF modeling. Accordingly, a new observable related to the gradient of the spectrum is defined, and demonstrated to have the capability of significantly reducing the correlation and potential bias between the PDF updating and the electroweak measurement. Moreover, the well-defined observable will provide unique information on the sea-valence PDF ratios of the first-generation quarks.
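The sketch below shows the basic quantities discussed above: the forward-backward asymmetry per dilepton-mass bin, $A_{FB} = (N_F - N_B)/(N_F + N_B)$, and a gradient-like quantity built from the change of $A_{FB}$ across mass bins around the $Z$ pole. The event counts and binning are toy numbers, not data, and the gradient shown is only a generic finite-difference stand-in for the paper's observable.

```python
# Toy sketch: A_FB per mass bin and its gradient with respect to dilepton mass.
import numpy as np

mass_edges = np.linspace(66.0, 116.0, 11)        # GeV, placeholder dilepton-mass binning
n_forward  = np.array([120, 150, 200, 300, 700, 2500, 2600, 800, 400, 260], float)
n_backward = np.array([180, 200, 240, 330, 720, 2400, 2300, 620, 290, 180], float)

a_fb = (n_forward - n_backward) / (n_forward + n_backward)
centers = 0.5 * (mass_edges[:-1] + mass_edges[1:])

grad_a_fb = np.gradient(a_fb, centers)           # shape of the A_FB spectrum vs. mass
print(np.round(a_fb, 3))
print(np.round(grad_a_fb, 4))
```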
Motives or goals are recognized in the psychology literature as the most fundamental drives that explain and predict why people do what they do, including when they browse the web. Although providing enormous value, these higher-ordered goals are often unobserved, and little is known about how to leverage such goals to assist people's browsing activities. This paper proposes a new approach to this problem, fulfilled through a novel neural framework, Goal-directed Web Browsing (GoWeB). We adopt a psychologically sound taxonomy of higher-ordered goals and learn to build their representations in a structure-preserving manner. We then incorporate the resulting representations to enhance common activities people perform on the web. Experiments on large-scale data from the Microsoft Edge web browser show that GoWeB significantly outperforms competitive baselines for in-session web page recommendation, re-visitation classification, and goal-based web page grouping. A follow-up analysis further characterizes how the variety of human motives affects the differences observed in human behavioral patterns.
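The sketch below is only my assumption of what "structure-preserving" could mean for a goal taxonomy, not the GoWeB implementation: goal embeddings are learned with a hierarchical regularizer that keeps each child goal's representation close to its parent's, which would be weighted into the task loss.

```python
# Minimal sketch of a hierarchical (structure-preserving) regularizer on goal embeddings.
import torch
import torch.nn as nn

class GoalEmbedding(nn.Module):
    def __init__(self, num_goals, dim, parent_of):
        super().__init__()
        self.emb = nn.Embedding(num_goals, dim)
        self.parent_of = parent_of                     # dict: child goal id -> parent goal id

    def structure_loss(self):
        # pull each child-goal embedding toward its parent's embedding
        children = torch.tensor(list(self.parent_of.keys()))
        parents = torch.tensor([self.parent_of[c.item()] for c in children])
        return ((self.emb(children) - self.emb(parents)) ** 2).sum(dim=1).mean()

goals = GoalEmbedding(num_goals=5, dim=16, parent_of={1: 0, 2: 0, 3: 2, 4: 2})
print(goals.structure_loss())                          # added to the task loss with some weight
```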
Qi Yang, Peng Yang, Ke Tang (2021)
The past decade has seen the rapid development of Reinforcement Learning (RL), which achieves impressive performance given abundant training resources. However, one of the greatest challenges in RL is generalization efficiency (i.e., generalization performance per unit time). This paper proposes a framework of Active Reinforcement Learning (ARL) over MDPs to improve generalization efficiency under a limited resource budget via instance selection. Given a number of instances, the algorithm selects valuable instances as training sets while training the policy. Unlike existing approaches, we actively select and use training data rather than training on all the given data, thereby costing fewer resources. Furthermore, we introduce a general instance evaluation metric and selection mechanism into the framework. Experimental results reveal that the proposed framework, with Proximal Policy Optimization as the policy optimizer, improves generalization efficiency more effectively than unselected and unbiased-selection baselines.
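The skeleton below shows the generic shape of such an active instance-selection loop; it is not the paper's algorithm. `score_instance` and `train_policy` are placeholders for an actual instance evaluation metric (e.g., TD error or return variance) and a PPO update, respectively.

```python
# Generic skeleton of active instance selection for RL training.
import random

def score_instance(policy, instance):
    # placeholder evaluation metric; in practice e.g. TD error or policy uncertainty
    return random.random()

def train_policy(policy, instances):
    return policy                                  # placeholder for a PPO update on `instances`

def active_rl(policy, instance_pool, k, rounds):
    for _ in range(rounds):
        scores = [(score_instance(policy, ins), ins) for ins in instance_pool]
        selected = [ins for _, ins in sorted(scores, key=lambda s: s[0], reverse=True)[:k]]
        policy = train_policy(policy, selected)    # spend the training budget only on the top-k
    return policy

active_rl(policy=None, instance_pool=list(range(100)), k=10, rounds=3)
```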
Recovering a 3D head model including the complete face and hair regions is still a challenging problem in computer vision and graphics. In this paper, we consider this problem with a few multi-view portrait images as input. Previous multi-view stereo methods, whether based on optimization strategies or deep learning techniques, struggle to recover the low-frequency geometric structure, producing unclear head shapes and inaccurate reconstructions in hair regions. To tackle this problem, we propose a prior-guided implicit neural rendering network. Specifically, we model the head geometry with a learnable signed distance field (SDF) and optimize it via an implicit differentiable renderer with the guidance of several human head priors, including facial prior knowledge, head semantic segmentation information, and 2D hair orientation maps. The utilization of these priors improves the reconstruction accuracy and robustness, leading to a high-quality integrated 3D head model. Extensive ablation studies and comparisons with state-of-the-art methods demonstrate that our method can produce high-fidelity 3D head geometries with the guidance of these priors.
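The sketch below lists the ingredients described above in a minimal form: an MLP representing the learnable SDF, and a total loss that adds prior-guidance terms to the rendering loss. The network size, loss stubs, and weights are illustrative assumptions, not the paper's renderer.

```python
# Minimal sketch: SDF MLP plus a rendering loss combined with prior-guidance terms.
import torch
import torch.nn as nn

class SDFNet(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),                  # signed distance to the head surface
        )

    def forward(self, x):                          # x: (N, 3) query points
        return self.mlp(x)

def total_loss(render_l, face_prior_l, seg_l, hair_orient_l,
               w_face=0.1, w_seg=0.1, w_hair=0.05):
    # rendering term plus the facial-prior, segmentation, and hair-orientation terms
    return render_l + w_face * face_prior_l + w_seg * seg_l + w_hair * hair_orient_l

sdf = SDFNet()
print(sdf(torch.randn(4, 3)).shape)                # torch.Size([4, 1])
```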
Yifu Wang, Jiaqi Yang, Xin Peng (2021)
We present a new solution to tracking and mapping with an event camera. The motion of the camera contains both rotation and translation, and the displacements happen in an arbitrarily structured environment. As a result, the image matching may no longer be represented by a low-dimensional homographic warping, thus complicating an application of the commonly used Image of Warped Events (IWE). We introduce a new solution to this problem by performing contrast maximization in 3D. The 3D location of the rays cast for each event is smoothly varied as a function of a continuous-time motion parametrization, and the optimal parameters are found by maximizing the contrast in a volumetric ray density field. Our method thus performs joint optimization over motion and structure. The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera. The method approaches the performance obtained with regular cameras, and eventually outperforms them in challenging visual conditions.
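The toy sketch below conveys the objective described above: ray samples derived from events are accumulated into a volumetric density field, and the "contrast" to be maximized is taken here as the variance of that volume. The continuous-time motion model that moves the ray samples with the parameters is left out, and the sample points are random placeholders; this is not the authors' optimizer.

```python
# Toy sketch: contrast (variance) of a volumetric ray density field built from 3D samples.
import numpy as np

def ray_density_contrast(ray_points, grid_min, voxel_size, grid_shape):
    """Accumulate 3D ray sample points into a voxel grid and return its variance."""
    idx = np.floor((ray_points - grid_min) / voxel_size).astype(int)
    valid = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    density = np.zeros(grid_shape)
    np.add.at(density, tuple(idx[valid].T), 1.0)   # count ray samples per voxel
    return density.var()                           # contrast of the ray density field

pts = np.random.default_rng(2).uniform(0, 1, size=(5000, 3))   # placeholder ray samples
print(ray_density_contrast(pts, grid_min=np.zeros(3), voxel_size=0.1, grid_shape=(10, 10, 10)))
```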