Research papers and master's and doctoral theses published by Zhan Ma

Dynamical Clustering Interrupts Motility Induced Phase Separation in Chiral Active Brownian Particles

Zhan Ma, Ran Ni (2021)

Using computer simulations and dynamic mean-field theory, we demonstrate that sufficiently fast rotation of circle active Brownian particles in two dimensions generates a dynamical clustering state that interrupts the conventional motility induced phase separation (MIPS). Multiple clusters arise from the combination of the conventional MIPS cohesion and the disintegration caused by the circulating current. The non-vanishing current in non-equilibrium steady states microscopically originates from the motility "relieved" by automatic rotation, which breaks detailed balance at the continuum level. This mechanism sheds light on the formation of dynamic clusters observed in a variety of active matter systems, and may help examine the generalization of effective thermodynamic concepts developed in the context of MIPS.

Soft Condensed Matter, Statistical Mechanics, Biological Physics
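
As a rough illustration of what a "circle" (chiral) active Brownian particle is, the sketch below integrates a single, non-interacting particle with an Euler-Maruyama step. It is not the simulation used in the paper, which involves many interacting particles; the function name `simulate_chiral_abp` and the parameters `v0` (self-propulsion speed), `omega` (rotation rate), `d_rot` and `d_trans` (rotational and translational diffusion coefficients) are assumptions chosen for this example.

```python
import math
import random


def simulate_chiral_abp(v0, omega, d_rot, d_trans, dt, n_steps, seed=0):
    """Euler-Maruyama trajectory of one chiral active Brownian particle in 2D.

    Hypothetical minimal model (no interactions):
        dx     = v0 cos(theta) dt + sqrt(2 d_trans dt) * N(0, 1)
        dy     = v0 sin(theta) dt + sqrt(2 d_trans dt) * N(0, 1)
        dtheta = omega dt         + sqrt(2 d_rot dt)   * N(0, 1)
    """
    rng = random.Random(seed)
    x, y, theta = 0.0, 0.0, 0.0
    trajectory = [(x, y)]
    trans_amp = math.sqrt(2.0 * d_trans * dt)
    rot_amp = math.sqrt(2.0 * d_rot * dt)
    for _ in range(n_steps):
        # Self-propulsion along the current heading plus translational noise.
        x += v0 * math.cos(theta) * dt + trans_amp * rng.gauss(0.0, 1.0)
        y += v0 * math.sin(theta) * dt + trans_amp * rng.gauss(0.0, 1.0)
        # Constant rotation (chirality) plus rotational noise.
        theta += omega * dt + rot_amp * rng.gauss(0.0, 1.0)
        trajectory.append((x, y))
    return trajectory
```

With both diffusion coefficients set to zero the particle traces a circle of radius close to `v0 / omega`, which is the sense in which rotation limits how far self-propulsion can carry a particle.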

Transfer Learning of Memory Kernels in Coarse-grained Modeling

Zhan Ma, Shu Wang, Minhee Kim (2021)

The present work concerns the transferability of coarse-grained (CG) modeling in reproducing the dynamic properties of the reference atomistic systems across a range of parameters. In particular, we focus on implicit-solvent CG modeling of polymer solutions. The CG model is based on the generalized Langevin equation, where the memory kernel plays the critical role in determining the dynamics on all time scales. Thus, we propose methods for transfer learning of memory kernels. The key ingredient of our methods is Gaussian process regression. By integration with model order reduction via proper orthogonal decomposition and an active learning technique, the transfer learning can be practically efficient and requires minimal training data. Through two example polymer solution systems, we demonstrate the accuracy and efficiency of the proposed transfer learning methods in the construction of transferable memory kernels. The transferability allows for out-of-sample predictions, even in the extrapolated domain of parameters. Built on the transferable memory kernels, the CG models can reproduce the dynamic properties of polymers on all time scales at different thermodynamic conditions (such as temperature and solvent viscosity) and for different systems with varying concentrations and lengths of polymers.

Computational Engineering, Finance, and Science; Computational Physics
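
For readers unfamiliar with the term, the generalized Langevin equation mentioned in the abstract above is commonly written in the following textbook form for a scalar CG coordinate. The notation is an assumption made here for illustration ($M$ effective mass, $X$ and $V$ the CG coordinate and velocity, $F$ the conservative force, $K$ the memory kernel, $R$ the random force) and is not taken from the paper.

```latex
M \dot{V}(t) = F\bigl(X(t)\bigr) - \int_0^t K(t-s)\, V(s)\, \mathrm{d}s + R(t),
\qquad
\langle R(t)\, R(t') \rangle = k_B T\, K(t-t')
```

The second relation is the fluctuation-dissipation condition that ties the noise to the kernel at equilibrium, which is why the kernel $K$ controls the dynamics on every time scale.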

Data-driven Coarse-grained Modeling of Non-equilibrium Systems

Shu Wang, Zhan Ma, Wenxiao Pan (2021)

Modeling a high-dimensional Hamiltonian system in reduced dimensions with respect to coarse-grained (CG) variables can greatly reduce computational cost and enable efficient bottom-up prediction of the main features of the system for many applications. However, it usually experiences significantly altered dynamics due to the loss of degrees of freedom upon coarse-graining. To establish CG models that can faithfully preserve dynamics, previous efforts mainly focused on equilibrium systems. In contrast, various soft matter systems are known to be out of equilibrium. Therefore, the present work concerns non-equilibrium systems and enables accurate and efficient CG modeling that preserves non-equilibrium dynamics and is generally applicable to any non-equilibrium process and any observable of interest. To this end, the dynamic equation of a CG variable is built in the form of the non-stationary generalized Langevin equation (nsGLE) to account for the dependence of non-equilibrium processes on the initial conditions, where the two-time memory kernel is determined from the data of the two-time auto-correlation function of the non-equilibrium trajectory-averaged observable of interest. By embedding the non-stationary non-Markovian process in an extended stochastic framework, an explicit form of the non-stationary random noise in the nsGLE is introduced, and the cost of solving the nsGLE to predict the non-equilibrium dynamics of the CG variable is significantly reduced. To prove and exploit the equivalence of the nsGLE and the extended dynamics, the memory kernel is parameterized in a two-time exponential expansion. A data-driven hybrid optimization process is proposed for the parameterization, which is a non-convex and high-dimensional optimization problem.

Computational Engineering, Finance, and Science; Computational Physics
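
The cost argument behind embedding a memory term in extended dynamics can be seen in a much simpler setting than the paper's. The sketch below assumes a stationary, single-exponential kernel K(t) = c exp(-t / tau), not the two-time kernel of the nsGLE, and leaves out the random noise entirely. Both function names are hypothetical. The first evaluates the memory integral directly at every step; the second carries one auxiliary variable and produces the same values in a single pass.

```python
import math


def memory_force_direct(v, dt, c, tau):
    """Memory force -sum_{k<n} c exp(-(n-k) dt / tau) v[k] dt at each step n.

    Direct quadrature: the full history is revisited at every step,
    so the cost grows quadratically with the number of steps.
    """
    forces = []
    for n in range(len(v)):
        acc = 0.0
        for k in range(n):
            acc += c * math.exp(-(n - k) * dt / tau) * v[k] * dt
        forces.append(-acc)
    return forces


def memory_force_auxiliary(v, dt, c, tau):
    """Same memory force via one auxiliary variable z (linear cost).

    Update rule: z_{n+1} = exp(-dt / tau) * (z_n - c * v[n] * dt),
    which reproduces the discrete convolution above term by term.
    """
    decay = math.exp(-dt / tau)
    z = 0.0
    forces = [z]
    for n in range(len(v) - 1):
        z = decay * (z - c * v[n] * dt)
        forces.append(z)
    return forces
```

The two functions agree to floating-point precision for any input velocity history; only the exponential form of the kernel makes the one-variable recursion possible, which is one reason exponential expansions of memory kernels are convenient.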

Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model

Haojie Liu, Ming Lu, Zhan Ma (2020)

Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H.264/AVC and H.265/HEVC. On the other hand, deep neural networks (DNNs) have shown their powerful capacity for visual content understanding, feature extraction and compact representation. Some previous works have explored learnt video coding algorithms in an end-to-end manner, which show great potential compared with traditional methods. In this paper, we propose an end-to-end deep neural video coding framework (NVC), which uses variational autoencoders (VAEs) with joint spatial and temporal prior aggregation (PA) to exploit the correlations in intra-frame pixels, inter-frame motions and inter-frame compensation residuals, respectively. Novel features of NVC include: 1) to estimate and compensate motion over a large range of magnitudes, we propose an unsupervised multiscale motion compensation network (MS-MCN) together with a pyramid decoder in the VAE for coding motion features that generates multiscale flow fields, 2) we design a novel adaptive spatiotemporal context model for efficient entropy coding of motion information, 3) we adopt nonlocal attention modules (NLAM) at the bottlenecks of the VAEs for implicit adaptive feature extraction and activation, leveraging their high transformation capacity and unequal weighting with joint global and local information, and 4) we introduce multi-module optimization and a multi-frame training strategy to minimize the temporal error propagation among P-frames. NVC is evaluated for the low-delay causal settings and compared with H.265/HEVC, H.264/AVC and other learnt video compression methods following the common test conditions, demonstrating consistent gains across all popular test sequences for both PSNR and MS-SSIM distortion metrics.

Image and Video Processing, Computer Vision and Pattern Recognition

Inferring Point Cloud Quality via Graph Similarity

Qi Yang, Zhan Ma, Yiling Xu (2020)

We propose GraphSIM, an objective metric to accurately predict the subjective quality of point clouds with superimposed geometry and color impairments. Motivated by the facts that the human visual system is more sensitive to high spatial-frequency components (e.g., contours, edges) and gives more weight to local structural variations than to individual point intensity, we first extract geometric keypoints by resampling the reference point cloud geometry information to form the object skeleton; we then construct local graphs centered at these keypoints for both reference and distorted point clouds, followed by collectively aggregating color gradient moments (e.g., zeroth, first, and second) that are derived between all other points and the centered keypoint in the same local graph for significant feature similarity (a.k.a. local significance) measurement; the final similarity index is obtained by pooling the local graph significance across all color channels and by averaging across all graphs. Our GraphSIM is validated using two large and independent point cloud assessment datasets that involve a wide range of impairments (e.g., re-sampling, compression, additive noise), reliably demonstrating state-of-the-art performance for all distortions with noticeable gains in predicting the subjective mean opinion score (MOS), compared with the point-wise distance-based metrics adopted in standardization reference software. Ablation studies that examine its key modules and parameters further show that GraphSIM generalizes to various scenarios with consistent performance.

Image and Video Processing, Multimedia

Dynamic assembly of active colloids: theory and simulation

Zhan Ma, Mingcheng Yang (2020)

Because they consume energy to drive their motion, systems of active colloids are intrinsically out of equilibrium. In the past decade, a variety of intriguing dynamic patterns have been observed in systems of active colloids, and they offer a new platform for studying non-equilibrium physics, in which computer simulation and analytical theory have played an important role. Here we review the recent progress in understanding the dynamic assembly of active colloids by using numerical and analytical tools. We review the progress in understanding motility induced phase separation in the past decade, followed by a discussion of the effect of shape anisotropy and hydrodynamics on the dynamic assembly of active colloids.

Soft Condensed Matter

Object-Based Image Coding: A Learning-Driven Revisit

Qi Xia, Haojie Liu, Zhan Ma (2020)

Object-Based Image Coding (OBIC), which was extensively studied about two decades ago, promised broad application prospects for both ultra-low bitrate communication and high-level semantic content understanding, but it has rarely been used because of the inefficient compact representation of objects with arbitrary shape. A fundamental underlying issue is how to efficiently process arbitrary-shaped objects at a fine granularity (e.g., feature element or pixel wise). To address this, we propose to apply element-wise masking and compression by devising an object segmentation network for image layer decomposition, and parallel convolution-based neural image compression networks to process masked foreground objects and the background scene separately. All components are optimized in an end-to-end learning framework to intelligently weigh their (e.g., object and background) contributions for visually pleasant reconstruction. We have conducted comprehensive experiments to evaluate the performance on the PASCAL VOC dataset in a very low bitrate scenario (e.g., $\lesssim$0.1 bits per pixel, bpp), which demonstrate noticeable subjective quality improvement compared with JPEG2K, HEVC-based BPG and another learned image compression method. All relevant materials are made publicly accessible at https://njuvision.github.io/Neural-Object-Coding/.

Image and Video Processing, Computer Vision and Pattern Recognition

Learning End-to-End Lossy Image Compression: A Benchmark

Yueyu Hu, Wenhan Yang, Zhan Ma (2020)

Image compression is one of the most fundamental techniques and commonly used applications in the image and video processing field. Earlier methods built a well-designed pipeline, and efforts were made to improve all modules of the pipeline by handcrafted tuning. Later, tremendous contributions were made, especially when data-driven methods revitalized the domain with their excellent modeling capacities and flexibility in incorporating newly designed modules and constraints. Despite great progress, a systematic benchmark and comprehensive analysis of end-to-end learned image compression methods are lacking. In this paper, we first conduct a comprehensive literature survey of learned image compression methods. The literature is organized based on several aspects to jointly optimize the rate-distortion performance with a neural network, i.e., network architecture, entropy model and rate control. We describe milestones in cutting-edge learned image-compression methods, review a broad range of existing works, and provide insights into their historical development routes. With this survey, the main challenges of image compression methods are revealed, along with opportunities to address the related issues with recent advanced learning methods. This analysis provides an opportunity to take a further step towards higher-efficiency image compression. By introducing a coarse-to-fine hyperprior model for entropy estimation and signal reconstruction, we achieve improved rate-distortion performance, especially on high-resolution images. Extensive benchmark experiments demonstrate the superiority of our model in rate-distortion performance and time complexity on multi-core CPUs and GPUs. Our project website is available at https://huzi96.github.io/compression-bench.html.

Image and Video Processing, Computer Vision and Pattern Recognition
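
The phrase "jointly optimize the rate-distortion performance" in the abstract above usually refers to a Lagrangian training objective of the following general shape. The symbols are assumptions introduced here for illustration ($x$ the input image, $\hat{x}$ its reconstruction, $\hat{y}$ the quantized latent, $p$ the learned entropy model, $d$ a distortion measure such as mean squared error, $\lambda$ the trade-off weight) and are not taken from the paper.

```latex
\mathcal{L} = R + \lambda D,
\qquad
R = \mathbb{E}\bigl[-\log_2 p(\hat{y})\bigr],
\qquad
D = \mathbb{E}\bigl[d(x, \hat{x})\bigr]
```

Under this reading, the entropy model determines the rate term $R$, the network architecture mainly shapes the distortion term $D$, and rate control corresponds to the choice of $\lambda$.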

A Dual Camera System for High Spatiotemporal Resolution Video Acquisition

Ming Cheng, Zhan Ma, M. Salman Asif (2019)

This paper presents a dual camera system for high spatiotemporal resolution (HSTR) video acquisition, where one camera shoots a video with high spatial resolution and low frame rate (HSR-LFR) and the other captures a low spatial resolution and high frame rate (LSR-HFR) video. Our main goal is to combine videos from the LSR-HFR and HSR-LFR cameras to create an HSTR video. We propose an end-to-end learning framework, AWnet, mainly consisting of a FlowNet and a FusionNet that learn an adaptive weighting function in the pixel domain to combine inputs in a frame-recurrent fashion. To improve the reconstruction quality for cameras used in practice, we also introduce noise regularization under the same framework. Our method has demonstrated noticeable performance gains in terms of both objective PSNR measurement in simulation with different publicly available video and light-field datasets and subjective evaluation with real data captured by dual iPhone 7 and Grasshopper3 cameras. Ablation studies are further conducted to investigate various aspects of our system (such as reference structure, camera parallax, exposure time, etc.) to fully understand its capability for potential applications.

Image and Video Processing, Computer Vision and Pattern Recognition
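
Two ideas in the abstract above are simple to state in code: combining two inputs with a per-pixel weight, and scoring a reconstruction with PSNR. The sketch below is only an illustration on flat lists of pixel values. In AWnet the weights are learned by the network, whereas here they are passed in by hand, and the function names `blend_pixels` and `psnr` are hypothetical.

```python
import math


def blend_pixels(frame_a, frame_b, weights):
    """Per-pixel convex combination: w * a + (1 - w) * b, with w in [0, 1].

    The weights are an input here; a learned model would predict them.
    """
    if not (len(frame_a) == len(frame_b) == len(weights)):
        raise ValueError("inputs must have the same number of pixels")
    return [w * a + (1.0 - w) * b for a, b, w in zip(frame_a, frame_b, weights)]


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A weight near 1 keeps the pixel from the first input and a weight near 0 keeps it from the second, so a spatially varying weight map can favor whichever source is more reliable at each location.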
