Continuous-variable quantum key distribution (CV-QKD) with discrete modulation has received widespread attention because of its experimental simplicity, low-cost implementation, and ease of multiplexing with classical optical communication. Recently, inspiring numerical methods have been applied to analyse the security of discrete-modulated CV-QKD against collective attacks, promising considerable key rates over more than one hundred kilometers of fiber. However, these numerical methods require up to ten minutes on a high-performance personal computer to calculate a single secure key rate, making real-time extraction of the secure key rate impossible for a discrete-modulated CV-QKD system. Here, we present a neural network model that quickly and accurately predicts the secure key rate of homodyne-detection discrete-modulated CV-QKD from experimental parameters and experimental results. At an excess noise of about $0.01$, our method is about seven orders of magnitude faster than the conventional numerical method. Our method can be extended to quickly solve the complex secure key rate calculations of a variety of other unstructured quantum key distribution protocols.
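As an illustrative sketch (not the authors' model), the idea can be captured by a small fully connected regressor that maps experimental parameters to a predicted key rate; the feature set, network width, and example values below are all assumptions.

```python
# Minimal sketch (PyTorch): a fully connected regressor mapping experimental
# parameters to a predicted secure key rate. Features, widths, and values
# are illustrative assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class KeyRatePredictor(nn.Module):
    def __init__(self, n_features: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),  # predicted secure key rate (bits/pulse)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Hypothetical features: [excess noise, transmittance, modulation variance,
# reconciliation efficiency]; training labels would come from the slow
# numerical solver, after which inference takes microseconds, not minutes.
model = KeyRatePredictor()
params = torch.tensor([[0.01, 0.1, 0.35, 0.95]])
print(model(params))
```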
Independently exploring unknown spaces or finding objects in an indoor environment is a daily but challenging task for visually impaired people. Common 2D assistive systems lack the depth relationships between objects, making it difficult to obtain an accurate spatial layout and the relative positions of objects. To tackle these issues, we propose HIDA, a lightweight assistive system based on 3D point cloud instance segmentation with a solid-state LiDAR sensor, for holistic indoor detection and avoidance. The entire system consists of three hardware components, two interactive functions (obstacle avoidance and object finding), and a voice user interface. Guided by voice, the user performs an on-site scan that captures a point cloud of the most recent state of the changing indoor environment. We design a point cloud segmentation model with dual lightweight decoders for semantic and offset predictions, which satisfies the efficiency requirements of the whole system. After 3D instance segmentation, we post-process the segmented point cloud by removing outliers and projecting all points onto a top-view 2D map representation. The system integrates this information and interacts with users intuitively through acoustic feedback. The proposed 3D instance segmentation model achieves state-of-the-art performance on the ScanNet v2 dataset. Comprehensive field tests with various tasks in a user study verify the usability and effectiveness of our system for assisting visually impaired people with holistic indoor understanding, obstacle avoidance, and object search.
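A minimal sketch of the described post-processing step, assuming Open3D's statistical outlier removal and a simple occupancy grid for the top-view map; the thresholds and grid resolution are illustrative, not the system's actual settings.

```python
# Sketch of the post-processing: statistical outlier removal with Open3D,
# then projection of the remaining points onto a top-view 2D grid.
# nb_neighbors, std_ratio, and cell size are illustrative assumptions.
import numpy as np
import open3d as o3d

def to_top_view_map(pcd: o3d.geometry.PointCloud, cell: float = 0.05):
    # Drop sparse outliers based on mean distance to neighboring points.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    pts = np.asarray(pcd.points)               # (N, 3); assume z is "up"
    xy = pts[:, :2]
    ij = np.floor((xy - xy.min(axis=0)) / cell).astype(int)
    grid = np.zeros(ij.max(axis=0) + 1, dtype=np.uint8)
    grid[ij[:, 0], ij[:, 1]] = 1               # cells occupied, seen from above
    return grid
```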
There has recently been growing interest in utilizing multimodal sensors to achieve robust lane line segmentation. In this paper, we introduce a novel multimodal fusion architecture from an information-theoretic perspective and demonstrate its practical utility with Light Detection and Ranging (LiDAR)-camera fusion networks. In particular, we develop, for the first time, a multimodal fusion network as a joint coding model, where each single node, layer, and pipeline is represented as a channel. Forward propagation is thus equivalent to information transmission through the channels. This lets us qualitatively and quantitatively analyze the effect of different fusion approaches. We argue that the optimal fusion architecture is related to the essential capacity and its allocation, determined by the source and the channel. To test this multimodal fusion hypothesis, we progressively build a series of multimodal models based on the proposed fusion methods and evaluate them on the KITTI and A2D2 datasets. Our optimal fusion network achieves over 85% lane line accuracy and over 98.7% overall accuracy. The performance gap among the models will inform future research on optimal fusion algorithms for the deep multimodal learning community.
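To make the fusion idea concrete, here is a hedged sketch of a mid-level LiDAR-camera fusion block in PyTorch, where each modality has its own encoder "channel" and features are fused by concatenation; the channel widths and layout are assumptions, not the paper's architecture.

```python
# Illustrative mid-level fusion block: separate camera and LiDAR encoder
# channels, fused by concatenation before a shared lane-segmentation head.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU())

class MidFusionSeg(nn.Module):
    def __init__(self):
        super().__init__()
        self.cam_enc = conv_block(3, 32)    # RGB camera branch
        self.lidar_enc = conv_block(1, 32)  # projected LiDAR depth branch
        self.decoder = nn.Sequential(conv_block(64, 32),
                                     nn.Conv2d(32, 1, 1))  # lane mask logits

    def forward(self, rgb, depth):
        fused = torch.cat([self.cam_enc(rgb), self.lidar_enc(depth)], dim=1)
        return self.decoder(fused)

logits = MidFusionSeg()(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
```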
An important goal in human-robot interaction (HRI) is for machines to achieve a close-to-human level of face perception. One important difference between machine learning and human intelligence is machines' lack of compositionality. This paper introduces a new scheme that enables generative adversarial networks to learn the distribution of face images composed of smaller parts, yielding more flexible machine face perception and easier generalization beyond the training examples. We demonstrate that this model produces realistic, high-quality face images by generating and piecing together the parts, and that it learns the relations between the facial parts and their distributions. As a result, specific facial parts are interchangeable between generated face images.
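A toy sketch of the compositional idea, under strong simplifying assumptions: independent part generators produce facial parts that are pasted into fixed slots of a face canvas. A real model would learn placement and blending; every module and slot here is hypothetical.

```python
# Toy compositional generation: per-part generators whose outputs are
# pasted into fixed canvas slots. Parts are interchangeable across faces.
import torch
import torch.nn as nn

class PartGenerator(nn.Module):
    def __init__(self, z_dim=64, out_hw=16):
        super().__init__()
        self.out_hw = out_hw
        self.net = nn.Sequential(nn.Linear(z_dim, 3 * out_hw * out_hw), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, 3, self.out_hw, self.out_hw)

parts = {"eyes": (8, 16), "nose": (24, 24), "mouth": (40, 20)}  # (row, col)
gens = {name: PartGenerator() for name in parts}
canvas = torch.zeros(1, 3, 64, 64)
for name, (r, c) in parts.items():
    patch = gens[name](torch.randn(1, 64))   # sample a part independently
    canvas[:, :, r:r + 16, c:c + 16] = patch  # piece parts into the face
```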
In this paper, we revisit the Image-to-Image (I2I) translation problem with transition consistency, namely the consistency defined on the conditional data mapping between each data pair. Explicitly parameterizing each data mapping with a transition variable $t$, i.e., $x \overset{t(x,y)}{\mapsto} y$, we discover that existing I2I translation models mainly focus on maintaining consistency of results, e.g., image reconstruction or attribute prediction, which we call result consistency. This restricts their ability to generalize to unseen transitions at test time. Consequently, we propose to enforce both result consistency and transition consistency for I2I translation, benefiting the problem with a closer consistency between input and output. To improve the generalization ability of the translation model, we propose transition encoding to facilitate explicit regularization of these two kinds of consistency on unseen transitions. We further generalize these explicitly regularized consistencies to the distribution level, facilitating a generalized overall consistency for I2I translation problems. With the above design, our proposed model, named Transition Encoding GAN (TEGAN), possesses superb generalization ability, generating realistic and semantically consistent translation results for unseen transitions at test time. It also provides a unified understanding of existing GAN-based I2I translation models through our explicit modeling of the data mapping, i.e., the transition. Experiments on four different I2I translation tasks demonstrate the efficacy and generality of TEGAN.
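A hedged sketch of how the two consistency terms might be combined: result consistency compares the translated output against the target, while transition consistency re-encodes the generated pair and compares the recovered transition variable with the one used for generation. The placeholder networks below stand in for TEGAN's actual generator and transition encoder.

```python
# Sketch of result + transition consistency losses with placeholder modules.
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Linear(3 + 2, 3)   # generator: (input feature, transition t) -> output
T_enc = nn.Linear(6, 2)   # transition encoder: (x, y) -> t

def tegan_losses(x, y):
    t = T_enc(torch.cat([x, y], dim=1))          # transition encoding t(x, y)
    y_hat = G(torch.cat([x, t], dim=1))          # translate x under transition t
    result_loss = F.l1_loss(y_hat, y)            # result consistency
    t_hat = T_enc(torch.cat([x, y_hat], dim=1))  # transition consistency:
    transition_loss = F.l1_loss(t_hat, t)        # recovered t should match t
    return result_loss + transition_loss

loss = tegan_losses(torch.rand(8, 3), torch.rand(8, 3))
```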
Objective evaluation of new and improved methods for PET imaging requires access to images with ground truth, as can be obtained through simulation studies. For these studies to be clinically relevant, however, the simulated images must be clinically realistic. In this study, we develop a stochastic, physics-based method to generate realistic oncological two-dimensional (2-D) PET images in which the ground-truth tumor properties are known. The developed method extends a previously proposed approach that captures the observed variability in tumor properties from an actual patient population. We further extend that approach to model intra-tumor heterogeneity using a lumpy object model. To quantitatively evaluate the clinical realism of the simulated images, we conducted a human-observer study: a two-alternative forced-choice (2AFC) study with trained readers (five PET physicians and one PET physicist). The readers achieved an average accuracy of ~50% in the 2AFC study, and the developed simulation method generated a wide variety of clinically observed tumor types. These results provide evidence for applying this method to 2-D PET imaging and motivate its extension to generating 3-D PET images.
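For intuition, here is a minimal sketch of a 2-D lumpy object model of the kind commonly used for heterogeneity: a Poisson-distributed number of Gaussian lumps at uniformly random positions. The lump count, amplitude, and width are illustrative parameters, not the paper's values.

```python
# Sketch: 2-D lumpy object model for intra-tumor heterogeneity.
import numpy as np

def lumpy_tumor(size=64, mean_lumps=10, amp=1.0, sigma=4.0, rng=None):
    rng = rng or np.random.default_rng()
    yy, xx = np.mgrid[0:size, 0:size]
    img = np.zeros((size, size))
    # Draw a Poisson number of lumps, each a Gaussian blob at a random center.
    for _ in range(rng.poisson(mean_lumps)):
        cy, cx = rng.uniform(0, size, 2)
        img += amp * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return img
```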
Quantitative measures of uptake in the caudate, putamen, and globus pallidus in dopamine transporter (DaT) brain SPECT have potential as biomarkers for the severity of Parkinson disease. Reliable quantification of uptake requires accurate segmentation of these regions. However, segmentation is challenging in DaT SPECT due to partial-volume effects, system noise, physiological variability, and the small size of these regions. To address these challenges, we propose an estimation-based approach to segmentation. This approach estimates the posterior mean of the fractional volume occupied by the caudate, putamen, and globus pallidus within each voxel of a 3D SPECT image. The estimate is obtained by minimizing a cost function based on the binary cross-entropy loss between the true and estimated fractional volumes over a population of SPECT images, where the distribution of the true fractional volumes is obtained from magnetic resonance images of clinical populations. The proposed method accounts for both sources of partial-volume effects in SPECT, namely the limited system resolution and tissue-fraction effects. The method was implemented using an encoder-decoder network and evaluated in realistic, clinically guided SPECT simulation studies where the ground-truth fractional volumes were known. The method significantly outperformed all other considered segmentation methods, yielding accurate segmentation with Dice similarity coefficients of ~0.80 for all regions. It was relatively insensitive to changes in voxel size and relatively robust to up to +/- 10 degrees of patient head tilt along the transaxial, sagittal, and coronal planes. Overall, the results demonstrate the efficacy of the proposed method for accurate, fully automated segmentation of the caudate, putamen, and globus pallidus in 3D DaT-SPECT images.
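The estimation view of segmentation can be sketched as follows: a network outputs, per voxel, continuous fractional volumes in [0, 1] rather than hard labels, trained with binary cross-entropy against continuous ground-truth fractions. The tiny architecture and shapes below are assumptions for illustration only.

```python
# Sketch: per-voxel fractional-volume estimation with BCE on soft targets.
import torch
import torch.nn as nn

regions = 4  # e.g., caudate, putamen, globus pallidus, background
net = nn.Sequential(
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, regions, 1), nn.Sigmoid(),   # fractional volumes per voxel
)
spect = torch.rand(2, 1, 32, 32, 32)           # simulated SPECT batch
frac_true = torch.rand(2, regions, 32, 32, 32)
frac_true = frac_true / frac_true.sum(dim=1, keepdim=True)  # fractions sum to 1
loss = nn.functional.binary_cross_entropy(net(spect), frac_true)
loss.backward()
```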
Siyuan Shen, Zi Wang, Ping Liu (2021)
We present a neural modeling framework for Non-Line-of-Sight (NLOS) imaging. Previous solutions have sought to explicitly recover the 3D geometry (e.g., as point clouds) or voxel density (e.g., within a pre-defined volume) of the hidden scene. In contrast, inspired by the recent Neural Radiance Field (NeRF) approach, we use a multi-layer perceptron (MLP) to represent the neural transient field, or NeTF. Unlike NeRF, however, NeTF measures the transient over spherical wavefronts rather than the radiance along lines. We therefore formulate a spherical-volume NeTF reconstruction pipeline applicable to both confocal and non-confocal setups. Compared with NeRF, NeTF samples a much sparser set of viewpoints (scanning spots), and the sampling is highly uneven. We thus introduce a Monte Carlo technique to improve the robustness of the reconstruction. Comprehensive experiments on synthetic and real datasets demonstrate that NeTF provides higher-quality reconstruction and preserves fine details that are largely missing in the state of the art.
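A minimal sketch of an MLP scene representation in the NeTF spirit: a positionally encoded 3-D point maps to a density and an albedo that a renderer would then integrate over spherical wavefronts. The encoding frequencies and layer sizes are assumptions, not the paper's configuration.

```python
# Sketch: positional-encoded MLP mapping a hidden-scene point to
# (volume density, albedo), queried by a spherical-wavefront renderer.
import torch
import torch.nn as nn

def pos_enc(x, n_freq=6):
    feats = [x]
    for k in range(n_freq):
        feats += [torch.sin(2 ** k * x), torch.cos(2 ** k * x)]
    return torch.cat(feats, dim=-1)

class NeTFMLP(nn.Module):
    def __init__(self, n_freq=6):
        super().__init__()
        d_in = 3 * (2 * n_freq + 1)
        self.net = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                 nn.Linear(256, 256), nn.ReLU(),
                                 nn.Linear(256, 2))  # (density, albedo)

    def forward(self, xyz):
        return self.net(pos_enc(xyz))

out = NeTFMLP()(torch.rand(1024, 3))  # query a batch of hidden-scene points
```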
Feng Wang, Huaping Liu (2020)
Unsupervised contrastive learning has achieved outstanding success, yet the mechanism of the contrastive loss has been less studied. In this paper, we concentrate on understanding the behaviour of the unsupervised contrastive loss. We show that the contrastive loss is a hardness-aware loss function and that the temperature $\tau$ controls the strength of the penalties on hard negative samples. Previous work has shown that uniformity is a key property of contrastive learning. We build relations between uniformity and the temperature $\tau$. We show that uniformity helps contrastive learning to learn separable features; however, excessive pursuit of uniformity makes the contrastive loss intolerant to semantically similar samples, which may break the underlying semantic structure and harm the formation of features useful for downstream tasks. This is caused by an inherent defect of the instance discrimination objective: it tries to push all different instances apart, ignoring the underlying relations between samples. Pushing semantically consistent samples apart does not help acquire a prior that is informative for general downstream tasks. A well-designed contrastive loss should have some tolerance for the closeness of semantically similar samples. We therefore find that the contrastive loss faces a uniformity-tolerance dilemma, and that a good choice of temperature can properly balance these two properties, both learning separable features and tolerating semantically similar samples, improving feature quality and downstream performance.
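For reference, the standard InfoNCE form of the contrastive loss studied here, with the temperature $\tau$ scaling the similarity logits: a small $\tau$ sharpens the distribution and penalizes the hardest negatives most strongly.

```python
# Standard InfoNCE contrastive loss with temperature tau.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau            # cosine similarities / temperature
    labels = torch.arange(z1.size(0))     # positive pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(256, 128), torch.randn(256, 128), tau=0.07)
```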
Yanping Liu (2020)
Cell migration, which can be significantly affected by intracellular signaling pathways (ICSP) and the extracellular matrix (ECM), plays a crucial role in many physiological and pathological processes. The efficiency of cell migration, typically modeled as a persistent random walk (PRW), depends on two critical motility parameters: migration speed and persistence. Efficiently and accurately extracting these key dynamic parameters from noisy experimental data is generally very challenging. Here, we employ the normalized Shannon entropy to quantify the deviation of cell migration dynamics from diffusive/ballistic motion and to derive the persistence of cell migration from the Fourier power spectrum of migration velocities. Moreover, we introduce a time-varying Shannon entropy based on the wavelet power spectrum of cellular dynamics and demonstrate its superior utility in characterizing the time-dependent persistence of cell migration, which typically results from complex and time-varying intracellular or extracellular mechanisms. We apply our approach to trajectory data of in vitro cell migration regulated by distinct intracellular and extracellular mechanisms, exhibiting a rich spectrum of dynamic characteristics. Our analysis indicates that the combination of Shannon entropy and the wavelet transform offers a simple and efficient tool to estimate the persistence of cell migration, which may also reflect the real-time effects of ICSP and ECM to some extent.
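A hedged sketch of the spectral entropy measure: normalize the Fourier power spectrum of a velocity trace into a probability distribution and compute its Shannon entropy, scaled by the log of the number of frequency bins. Persistent motion concentrates spectral power (lower entropy) while diffusive motion spreads it (higher entropy); the details below are assumptions, not the paper's exact estimator.

```python
# Sketch: normalized Shannon entropy of the velocity power spectrum.
import numpy as np

def normalized_spectral_entropy(velocity):
    power = np.abs(np.fft.rfft(velocity)) ** 2
    p = power / power.sum()               # treat spectrum as a distribution
    p = p[p > 0]                          # avoid log(0)
    return -(p * np.log(p)).sum() / np.log(len(power))  # scaled to [0, 1]

v = np.cumsum(np.random.randn(512))       # toy correlated signal standing in
print(normalized_spectral_entropy(v))     # for a measured velocity trace
```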