Research papers, master's theses, and doctoral dissertations published by Ren Yang

Perceptual Learned Video Compression with Recurrent Conditional GAN

154 - Ren Yang, Luc Van Gool, Radu Timofte 2021

This paper proposes a Perceptual Learned Video Compression (PLVC) approach with a recurrent conditional generative adversarial network. In our approach, the recurrent auto-encoder-based generator learns to fully explore the temporal correlation for compressing video. More importantly, we propose a recurrent conditional discriminator, which judges raw and compressed video conditioned on both spatial and temporal information, including the latent representation, temporal motion and hidden states in recurrent cells. This way, adversarial training pushes the generated video to be not only spatially photo-realistic but also temporally consistent with the ground truth and coherent among video frames. The experimental results show that the proposed PLVC model learns to compress video towards good perceptual quality at low bit-rate, and outperforms previous traditional and learned approaches on several perceptual quality metrics. The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches and the official HEVC test model (HM 16.20). The codes will be released at https://github.com/RenYang-home/PLVC.

Image and Video Processing, Computer Vision and Pattern Recognition

Curved accretion disks around rotating black holes without reflection symmetry

357 - Che-Yu Chen, Hsiang-Yi Karen Yang 2021

Rotating black holes without equatorial reflection symmetry can naturally arise in effective low-energy theories of fundamental quantum gravity, in particular, when parity-violating interactions are introduced. Adopting a theory-agnostic approach and considering a recently proposed Kerr-like black hole model, we investigate the structure and properties of the accretion disk around a rotating black hole without reflection symmetry. In the absence of reflection symmetry, the accretion disk is in general a curved surface, rather than a flat disk lying on the equatorial plane. Furthermore, the parameter $\epsilon$ that controls the reflection asymmetry would shrink the size of the innermost stable circular orbits, and enhance the efficiency of the black hole in converting rest-mass energy to radiation during accretion. In addition, we find that spin measurements based on the gravitational redshift observations of the disk, assuming a Kerr geometry, may overestimate the true spin values if the central object is actually a Kerr-like black hole with conspicuous equatorial reflection asymmetry.
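
As a point of reference for the link between a smaller innermost stable circular orbit (ISCO) and a higher radiative efficiency, the sketch below evaluates the standard Kerr expressions for the prograde ISCO radius and the efficiency 1 - E_ISCO, in units where G = c = M = 1. It covers only the reflection-symmetric Kerr case: the asymmetry parameter of the Kerr-like model is not modeled here, and the function names are illustrative assumptions rather than anything taken from the paper.

```python
import math


def kerr_isco_radius(a: float) -> float:
    """Prograde ISCO radius of a Kerr black hole, in units of GM/c^2.

    `a` is the dimensionless spin, 0 <= a <= 1 (standard Kerr only).
    """
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * a * a + z1 * z1)
    # Clamp at zero to guard against tiny negative values from rounding.
    inner = max(0.0, (3.0 - z1) * (3.0 + z1 + 2.0 * z2))
    return 3.0 + z2 - math.sqrt(inner)


def kerr_efficiency(a: float) -> float:
    """Fraction of rest-mass energy radiated by matter reaching the ISCO."""
    # Specific energy of a circular orbit at the Kerr ISCO.
    e_isco = math.sqrt(1.0 - 2.0 / (3.0 * kerr_isco_radius(a)))
    return 1.0 - e_isco


if __name__ == "__main__":
    for spin in (0.0, 0.5, 0.9, 1.0):
        print(spin, kerr_isco_radius(spin), kerr_efficiency(spin))
```

In this baseline the ISCO shrinks from 6 to 1 (in units of GM/c^2) as the spin goes from 0 to 1, and the efficiency rises accordingly, which is the same qualitative trend the abstract attributes to the asymmetry parameter.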

General Relativity and Quantum Cosmology, High Energy Astrophysical Phenomena, High Energy Physics - Theory

Synthesis-guided Adversarial Scenario Generation for Gray-box Feedback Control Systems with Sensing Imperfections

167 - Liren Yang, Necmiye Ozay 2021

In this paper, we study feedback dynamical systems with memoryless controllers under imperfect information. We develop an algorithm that searches for adversarial scenarios, which can be thought of as the strategy for the adversary representing the noise and disturbances, that lead to safety violations. The main challenge is to analyze the closed-loop system's vulnerabilities with a potentially complex or even unknown controller in the loop. As opposed to commonly adopted approaches that treat the system under test as a black box, we propose a synthesis-guided approach, which leverages the knowledge of a plant model at hand. This leads to a way to deal with gray-box systems (i.e., with known plant and unknown controller). Our approach reveals the role of the imperfect information in the violation. Examples show that our approach can find non-trivial scenarios that are difficult to expose by random simulations. This approach is further extended to incorporate model mismatch and to falsify vision-in-the-loop systems against finite-time reach-avoid specifications.

Systems and Control

Scalable Zonotopic Under-approximation of Backward Reachable Sets for Uncertain Linear Systems

93 - Liren Yang, Necmiye Ozay 2021

Zonotopes are widely used for over-approximating forward reachable sets of uncertain linear systems. In this paper, we use zonotopes to achieve more scalable algorithms that under-approximate backward reachable sets for uncertain linear systems. The main difference is that the backward reachability analysis is a two-player game and involves Minkowski difference operations, but zonotopes are not closed under such operations. We under-approximate this Minkowski difference with a zonotope, which can be obtained by solving a linear optimization problem. We further develop an efficient zonotope order reduction technique to bound the complexity of the obtained zonotopic under-approximations. The proposed approach is evaluated against existing approaches using randomly generated instances, and illustrated with an aircraft position control system.
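
For background on why zonotopes are convenient for some set operations, here is a minimal sketch, assuming the usual center-plus-generators representation Z = {c + sum_i xi_i g_i : |xi_i| <= 1}. It shows that the Minkowski sum of two zonotopes is again a zonotope (the generator lists are concatenated) and evaluates the support function in closed form. It does not implement the Minkowski difference or the linear-optimization-based under-approximation described in the abstract; the class and function names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class Zonotope:
    """Set {center + sum_i xi_i * generators[i] : -1 <= xi_i <= 1}."""
    center: Vector
    generators: List[Vector]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def minkowski_sum(z1: Zonotope, z2: Zonotope) -> Zonotope:
    """Zonotopes are closed under Minkowski sum: add centers, pool generators."""
    center = [a + b for a, b in zip(z1.center, z2.center)]
    return Zonotope(center, z1.generators + z2.generators)


def support(z: Zonotope, direction: Vector) -> float:
    """Support function: max of <direction, x> over x in z."""
    return dot(z.center, direction) + sum(
        abs(dot(g, direction)) for g in z.generators
    )


if __name__ == "__main__":
    box = Zonotope([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])  # unit box
    segment = Zonotope([1.0, 1.0], [[1.0, 1.0]])          # diagonal segment
    total = minkowski_sum(box, segment)
    print(total.center, len(total.generators), support(total, [1.0, 0.0]))
```

The number of generators grows with every sum, which is one reason an order reduction step such as the one mentioned in the abstract is needed to keep representations bounded.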

Systems and Control

R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network

188 - Jiang Hai, Zhu Xuan, Ren Yang 2021

Images captured in weak illumination conditions suffer from seriously degraded quality. Resolving the degradations of low-light images can effectively improve the visual quality of the image and the performance of high-level visual tasks. In this paper, we propose a novel Real-low to Real-normal Network for low-light image enhancement, dubbed R2RNet, based on the Retinex theory, which includes three subnets: a Decom-Net, a Denoise-Net, and a Relight-Net. These three subnets are used for decomposing, denoising, and contrast enhancement, respectively. Unlike most previous methods trained on synthetic images, we collect the first Large-Scale Real-World paired low/normal-light images dataset (LSRW dataset) for training. Our method can properly improve the contrast and suppress noise simultaneously. Extensive experiments on publicly available datasets demonstrate that our method outperforms the existing state-of-the-art methods by a large margin both quantitatively and visually. We also show that the performance of a high-level visual task (i.e., face detection) can be effectively improved by using the enhanced results obtained by our method in low-light conditions. Our codes and the LSRW dataset are available at: https://github.com/abcdef2000/R2RNet.

Computer Vision and Pattern Recognition, Image and Video Processing

NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

364 - Ren Yang, Radu Timofte, Jing Liu 2021

This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at a fixed bit-rate. Tracks 1 and 3 target improving the fidelity (PSNR), while Track 2 targets enhancing the perceptual quality. The three tracks attracted a total of 482 registrations. In the test phase, 12 teams, 8 teams and 11 teams submitted the final results of Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of video quality enhancement. The homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh
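
Since Tracks 1 and 3 are scored on fidelity (PSNR), a minimal sketch of the metric may help: PSNR = 10 log10(MAX^2 / MSE), computed here over flat sequences of pixel values. The 8-bit peak value of 255, the flat-sequence input, and the function name are assumptions made for illustration; how the challenge aggregates PSNR over frames and color channels is not stated in the abstract.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], distorted: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    raw = [100, 100, 100, 100]
    enhanced = [101, 99, 100, 100]
    print(psnr(raw, enhanced))
```

A higher PSNR means the enhanced frame is numerically closer to the raw frame, which is not the same thing as looking better to a viewer; that difference is why Track 2 is evaluated on perceptual quality instead.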

Image and Video Processing, Computer Vision and Pattern Recognition

A turbulence model based on deep neural network considering the near-wall effect

85 - Muyuan Liu, Yiren Yang, Hao Chen 2021

There is a continuous demand for improved turbulence models for the closure of Reynolds Averaged Navier-Stokes (RANS) simulations. Machine Learning (ML) offers effective tools for establishing advanced empirical Reynolds stress closures on the basis of high fidelity simulation data. This paper presents a turbulence model based on the Deep Neural Network (DNN) which takes into account the non-linear relationship between the Reynolds stress anisotropy tensor and the local mean velocity gradient as well as the near-wall effect. The construction and the tuning of the DNN-turbulence model are detailed. We show that the DNN-turbulence model trained on data from direct numerical simulations yields an accurate prediction of the Reynolds stresses for plane channel flow. In particular, we propose including the local turbulence Reynolds number in the model input.

Fluid Dynamics

Incorporating planning intelligence into deep learning: A planning support tool for street network design

88 - Zhou Fang, Ying Jin, Tianren Yang 2020

Deep learning applications in shaping ad hoc planning proposals are limited by the difficulty in integrating professional knowledge about cities with artificial intelligence. We propose a novel, complementary use of deep neural networks and planning guidance to automate street network generation that can be context-aware, example-based and user-guided. The model tests suggest that the incorporation of planning knowledge (e.g., road junctions and neighborhood types) in the model training leads to a more realistic prediction of street configurations. Furthermore, the new tool provides both professional and lay users an opportunity to systematically and intuitively explore benchmark proposals for comparisons and further evaluations.

Computer Vision and Pattern Recognition, Artificial Intelligence

DeepStreet: A deep learning powered urban street network generation module

86 - Zhou Fang, Tianren Yang, Ying Jin 2020

In countries experiencing unprecedented waves of urbanization, there is a need for rapid and high quality urban street design. Our study presents a novel deep learning powered approach, DeepStreet (DS), for automatic street network generation that can be applied to urban street design with local characteristics. DS is driven by a Convolutional Neural Network (CNN) that enables the interpolation of streets based on the areas of immediate vicinity. Specifically, the CNN is first trained to detect, recognize and capture the local features as well as the patterns of the existing street network sourced from OpenStreetMap. With the trained CNN, DS is able to predict the street network's future expansion patterns within a predefined region, conditioned on its surrounding street networks. To test the performance of DS, we apply it to an area in and around the Eixample area in the City of Barcelona, a well-known example in the fields of urban and transport planning with iconic grid-like street networks in the centre and irregular road alignments farther afield. The results show that DS can (1) detect and self-cluster different types of complex street patterns in Barcelona; (2) predict both gridiron and irregular street and road networks. DS proves to have great potential as a novel tool for designers to efficiently design an urban street network that maintains consistency across the existing and newly generated urban street network. Furthermore, the generated networks can serve as a benchmark to guide local plan-making, especially in rapidly developing cities.

Computer Vision and Pattern Recognition, Artificial Intelligence

Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model

196 - Ren Yang, Fabian Mentzer, Luc Van Gool 2020

The past few years have witnessed increasing interest in applying deep learning to video compression. However, the existing approaches compress a video frame with only a small number of reference frames, which limits their ability to fully exploit the temporal correlation among video frames. To overcome this shortcoming, this paper proposes a Recurrent Learned Video Compression (RLVC) approach with the Recurrent Auto-Encoder (RAE) and Recurrent Probability Model (RPM). Specifically, the RAE employs recurrent cells in both the encoder and decoder. As such, the temporal information in a large range of frames can be used for generating latent representations and reconstructing compressed outputs. Furthermore, the proposed RPM network recurrently estimates the Probability Mass Function (PMF) of the latent representation, conditioned on the distribution of previous latent representations. Due to the correlation among consecutive frames, the conditional cross entropy can be lower than the independent cross entropy, thus reducing the bit-rate. The experiments show that our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM. Moreover, our approach outperforms the default Low-Delay P (LDP) setting of x265 on PSNR, and also has better performance on MS-SSIM than the SSIM-tuned x265 and the slowest setting of x265. The codes are available at https://github.com/RenYang-home/RLVC.git.
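
The bit-rate argument in this abstract rests on a basic information-theoretic fact: conditioning on a correlated variable cannot increase entropy, so H(current | previous) <= H(current). The toy sketch below checks this on a hand-made joint distribution of two binary symbols. The distribution and function names are illustrative assumptions; the sketch does not reproduce the RPM network, which estimates cross entropy with learned models rather than exact entropies.

```python
import math
from typing import Dict, Hashable, Tuple


def entropy(p: Dict[Hashable, float]) -> float:
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def marginal_and_conditional_entropy(
    joint: Dict[Tuple[int, int], float]
) -> Tuple[float, float]:
    """Return (H(cur), H(cur | prev)) for joint[(prev, cur)] = probability."""
    p_prev: Dict[int, float] = {}
    p_cur: Dict[int, float] = {}
    for (prev, cur), p in joint.items():
        p_prev[prev] = p_prev.get(prev, 0.0) + p
        p_cur[cur] = p_cur.get(cur, 0.0) + p
    # Chain rule: H(cur | prev) = H(prev, cur) - H(prev).
    return entropy(p_cur), entropy(joint) - entropy(p_prev)


if __name__ == "__main__":
    # Consecutive symbols agree 90% of the time (strong temporal correlation).
    correlated = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
    h_cur, h_cond = marginal_and_conditional_entropy(correlated)
    print(h_cur, h_cond)
```

With independent symbols the two quantities coincide, so any saving comes entirely from the correlation between consecutive frames.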

Image and Video Processing, Computer Vision and Pattern Recognition
