
One-time tables are a class of two-party correlations that can help achieve information-theoretically secure two-party (interactive) classical or quantum computation. In this work we propose a bipartite quantum protocol for generating a simple type of one-time table (the correlation in the Popescu-Rohrlich nonlocal box) with partial security. We then show that by running many instances of the first protocol and performing checks on some of them, asymptotically information-theoretically secure generation of one-time tables can be achieved. The first protocol is adapted from a protocol for semi-honest oblivious transfer, with some changes so that no entangled state needs to be prepared and the communication involves only one qutrit in each direction. We show that some information tradeoffs in the first protocol are similar to those in the semi-honest oblivious transfer protocol. We also obtain two types of inequalities about guessing probabilities in some protocols for generating one-time tables, from a single type of inequality about guessing probabilities in semi-honest oblivious transfer protocols.
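For concreteness, the target correlation can be written down classically with a trusted dealer: the dealer hands out correlated random bits satisfying a XOR b = x AND y, and the parties later consume them on their real inputs via the standard derandomization trick. The Python sketch below uses illustrative names; the point of the paper's protocols is to generate such a table without any dealer.

```python
import secrets

def deal_one_time_table():
    """Trusted-dealer sketch of the target one-time table: correlated
    random bits with a_hat XOR b_hat = x_hat AND y_hat, each share
    marginally uniform. (The paper's protocols aim to generate this
    correlation quantumly, without a dealer.)"""
    x_hat, y_hat = secrets.randbits(1), secrets.randbits(1)
    a_hat = secrets.randbits(1)
    b_hat = a_hat ^ (x_hat & y_hat)
    return (x_hat, a_hat), (y_hat, b_hat)    # Alice's share, Bob's share

def consume_table(alice, bob, x, y):
    """One standard way to use the table on real inputs x, y: exchange
    the corrections d, e, then output bits whose XOR equals x AND y,
    as in the Popescu-Rohrlich box."""
    (x_hat, a_hat), (y_hat, b_hat) = alice, bob
    d = x ^ x_hat                        # announced by Alice
    e = y ^ y_hat                        # announced by Bob
    a = a_hat ^ (x_hat & e) ^ (d & e)    # Alice's output bit
    b = b_hat ^ (y_hat & d)              # Bob's output bit
    assert a ^ b == (x & y)              # the PR-box correlation holds
    return a, b
```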
Tao Wang, Li Yuan, Yunpeng Chen (2021)
Recently, DETR pioneered the solution of vision tasks with transformers: it directly translates the image feature map into the object detection result. Though effective, translating the full feature map can be costly due to redundant computation on areas such as the background. In this work, we encapsulate the idea of reducing spatial redundancy into a novel poll and pool (PnP) sampling module, with which we build an end-to-end PnP-DETR architecture that adaptively allocates its computation spatially to be more efficient. Concretely, the PnP module abstracts the image feature map into fine foreground object feature vectors and a small number of coarse background contextual feature vectors. The transformer models information interaction within the fine-coarse feature space and translates the features into the detection result. Moreover, the PnP-augmented model can instantly achieve various desired trade-offs between performance and computation with a single model by varying the sampled feature length, without needing to train multiple models as existing methods do. Thus it offers greater flexibility for deployment in diverse scenarios with varying computation constraints. We further validate the generalizability of the PnP module on panoptic segmentation and the recent transformer-based image recognition model ViT, and show consistent efficiency gains. We believe our method takes a step toward efficient visual analysis with transformers, wherein spatial redundancy is commonly observed. Code will be available at https://github.com/twangnh/pnp-detr.
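As a rough PyTorch illustration of the poll-and-pool idea (module and parameter names here are illustrative, not the paper's; the actual implementation lives at the repository above): a scoring head polls the top-scoring locations as fine features, and learned aggregation weights pool the map into a few coarse vectors.

```python
import torch
import torch.nn as nn

class PnPSampler(nn.Module):
    """Illustrative sketch of poll-and-pool sampling over a flattened
    feature map; the real PnP module differs in details (e.g., it keeps
    the fine-feature selection differentiable)."""
    def __init__(self, dim, num_coarse=16):
        super().__init__()
        self.score = nn.Linear(dim, 1)          # poll: per-location informativeness
        self.agg = nn.Linear(dim, num_coarse)   # pool: coarse-slot assignment

    def forward(self, feats, alpha=0.33):
        # feats: (B, L, C), the flattened image feature map
        B, L, C = feats.shape
        n_fine = max(1, int(alpha * L))          # the sampled feature length
        s = self.score(feats).squeeze(-1)        # (B, L)
        idx = s.topk(n_fine, dim=1).indices      # poll the top-n locations
        fine = torch.gather(feats, 1, idx.unsqueeze(-1).expand(-1, -1, C))
        w = self.agg(feats).softmax(dim=1)       # (B, L, M) soft assignment
        coarse = torch.einsum('blm,blc->bmc', w, feats)   # (B, M, C)
        return torch.cat([fine, coarse], dim=1)  # the fine-coarse feature space
```

Varying alpha at inference time is what would yield the single-model performance-computation trade-off described above.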
In the field of underwater vision research, image matching between sonar sensors and optical cameras has always been a challenging problem. Because the two imaging mechanisms differ, the gray values, texture, and contrast of acoustic images and optical images also vary locally, which renders traditional matching methods designed for optical images ineffective. Coupled with the difficulty and high cost of underwater data acquisition, this further hinders research on acousto-optic data fusion technology. In order to maximize the use of underwater sensor data and promote the development of multi-sensor information fusion (MSIF), this study applies a deep-learning-based image attribute transfer method to the acousto-optic image matching problem, the core idea being to eliminate the imaging differences between the two modalities as much as possible. At the same time, an advanced local feature descriptor is introduced to solve the challenging acousto-optic matching problem. Experimental results show that our proposed method can preprocess acousto-optic images effectively and obtain accurate matching results. Additionally, because the method operates on deep semantic feature layers of the images, it can indirectly display the local feature matching relationship between the original image pair, which provides a new solution to the underwater multi-sensor image matching problem.
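One common way to realize deep-learning-based attribute transfer is a Gatys-style objective over pretrained CNN features; the sketch below shows that generic formulation (the paper's exact transfer method may differ).

```python
import torch

def gram(f):
    """Gram matrix of a (B, C, H, W) feature map, the usual style statistic."""
    B, C, H, W = f.shape
    f = f.reshape(B, C, H * W)
    return f @ f.transpose(1, 2) / (C * H * W)

def transfer_losses(feats_gen, feats_content, feats_style):
    """Content loss keeps the acoustic image's structure; style loss pulls
    its low-level statistics toward the optical domain. Each argument is a
    list of feature maps taken from a fixed pretrained CNN."""
    content = sum(((g - c) ** 2).mean() for g, c in zip(feats_gen, feats_content))
    style = sum(((gram(g) - gram(s)) ** 2).sum() for g, s in zip(feats_gen, feats_style))
    return content, style
```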
Subsea images measured by side scan sonars (SSSs) are essential visual data in deep-sea exploration with autonomous underwater vehicles (AUVs). They vividly reflect the topography of the seabed, but are usually accompanied by complex and severe noise. This paper proposes a deep denoising method for SSS images that requires no high-quality reference data: it performs self-supervised denoising from a single noisy SSS image. Compared with classical hand-designed filters, the deep denoising method shows clear advantages. Denoising experiments performed on real seabed SSS images demonstrate that our proposed method can effectively reduce the noise in an SSS image while minimizing the loss of image quality and detail.
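The abstract does not spell out the network; the following is a minimal single-image, self-supervised sketch in the spirit of Deep Image Prior, which likewise trains on nothing but the one noisy image and relies on early stopping as the regularizer.

```python
import torch
import torch.nn as nn

def denoise_single_image(noisy, steps=1500, lr=0.01):
    """Deep-image-prior-style sketch: fit a small conv net from a fixed
    random code to the noisy SSS image, and stop before the net starts
    memorizing the noise. noisy: (1, 1, H, W) tensor in [0, 1]."""
    net = nn.Sequential(
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),
    )
    z = torch.randn(1, 32, *noisy.shape[-2:])   # fixed random code input
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(z) - noisy) ** 2).mean()   # reconstruct the noisy image
        loss.backward()
        opt.step()
    return net(z).detach()                      # denoised estimate
```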
This paper proposes a method that combines the style transfer technique and a learned descriptor to enhance the matching performance of underwater sonar images. In the field of underwater vision, sonar is currently the most effective long-distance detection sensor; it performs well in map building and target search tasks. However, traditional image matching algorithms were all developed for optical images. To resolve this mismatch, the style transfer method is used to convert sonar images into an optical style, and at the same time a learned descriptor with excellent expressiveness for sonar image matching is introduced. Experiments show that this method significantly enhances the matching quality of sonar images. In addition, it provides new ideas for the preprocessing of underwater sonar images via the style transfer approach.
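A sketch of the matching stage follows, with OpenCV's SIFT standing in for the paper's (unspecified) learned descriptor; the sonar image is assumed to have already been converted to an optical style.

```python
import cv2

def match_after_style_transfer(sonar_stylized, optical, ratio=0.75):
    """Detect keypoints on the stylized sonar image and the optical image,
    then keep matches passing Lowe's ratio test. SIFT is only a stand-in
    for the learned descriptor used in the paper."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(sonar_stylized, None)
    k2, d2 = sift.detectAndCompute(optical, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
            if m.distance < ratio * n.distance]
    return k1, k2, good
```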
Cong Ma, Xue Dong, Li Yu (2021)
The aim of this study is to design and evaluate a simple free-running analog-to-digital converter (ADC) based on a Field Programmable Gate Array (FPGA) device, to accomplish the energy and position readout of a silicon photomultiplier (SiPM) array for use in PET scanners. This FPGA-ADC, based on a carry-chain Time-to-Digital Converter (TDC) implemented on a Kintex-7 FPGA, requires only one off-chip resistor, so it offers greater advantages in system integration and cost reduction than commercial chips. In this paper, an FPGA-ADC-based front-end electronics prototype is presented, and both the design principle and implementation considerations are discussed. Experiments were performed using an 8 x 8 (crystal size: 4 x 4 x 15 mm$^3$) and a 12 x 12 (crystal size: 2.65 x 2.65 x 15 mm$^3$) segmented LYSO crystal array coupled with an 8 x 8 SiPM array (J-series, from ON Semiconductor) under $^{22}$Na point source excitation. Initial test results indicate that the energy resolution of the two detectors after correction is around 13.2% and 13.5% at 511 keV, and the profiles of the flood histograms show a clear visualization of the discrete scintillator elements. All measurements were carried out at room temperature (~25 °C) without additional cooling.
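For reference, the quoted energy resolution follows the usual FWHM-over-peak definition, which can be computed from an energy histogram with a Gaussian photopeak fit as sketched below (the paper's correction procedure is not reproduced).

```python
import numpy as np
from scipy.optimize import curve_fit

def energy_resolution(bin_centers, counts):
    """Fit a Gaussian to the 511 keV photopeak of an energy histogram and
    return FWHM / peak position in percent. In practice the fit window
    should be restricted to the photopeak region."""
    gauss = lambda x, a, mu, sigma: a * np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
    p0 = [counts.max(), bin_centers[np.argmax(counts)], bin_centers.std()]
    (a, mu, sigma), _ = curve_fit(gauss, bin_centers, counts, p0=p0)
    fwhm = 2 * np.sqrt(2 * np.log(2)) * abs(sigma)   # FWHM of a Gaussian
    return 100.0 * fwhm / mu
```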
Recently, the nature of viscoelastic drag-reducing turbulence (DRT), especially the maximum drag reduction (MDR) state, has become a focus of controversy. It has long been regarded as polymer-modulated inertial turbulence (IT), but this view is challenged by the newly proposed concept of elasto-inertial turbulence (EIT). This study re-pictures DRT in parallel plane channels by introducing the dynamics of EIT, based on statistical and budget analyses for a series of flow regimes from the onset of DR to EIT. Energy conversion between velocity fluctuations and polymers, as well as the polymeric pressure redistribution effect, are of particular concern, and based on them a new energy self-sustaining process (SSP) of DRT is re-pictured. The numerical results indicate that at low Reynolds number (Re), the flow enters the laminar regime before an EIT-related SSP can form as elasticity increases, whereas at moderate Re, an EIT-related SSP can get involved and survive relaminarization. This explains why relaminarization is observed at small Re while the flow directly enters MDR and EIT at moderate Re. Moreover, with the proposed energy picture, the newly discovered phenomenon that the streamwise velocity fluctuations lag behind those in the wall-normal direction can be well explained. The re-pictured SSP justifies the view that the IT nature of DRT is gradually replaced by that of EIT as elasticity increases.
Phase transitions governed by spontaneous time reversal symmetry breaking (TRSB) have long been sought in many quantum systems, including materials with anomalous Hall effect (AHE), cuprate high temperature superconductors, iridates, and so on. However, experimentally identifying such a phase transition is extremely challenging because the transition is hidden from many experimental probes. Here, using the zero-field muon spin relaxation (ZF-$\mu$SR) technique, we observe strong TRSB signals below 70 K in the newly discovered kagome superconductor CsV$_3$Sb$_5$. The TRSB state emerges from the 2 x 2 charge density wave (CDW) phase present below ~95 K. By carrying out optical second-harmonic generation (SHG) experiments, we also find that inversion symmetry is maintained in the temperature range of interest. Combining all the experimental results and symmetry constraints, we conclude that the interlayer coupled chiral flux phase (CFP) is the most promising candidate for the TRSB state among all theoretical proposals of orbital current orders. Thus, the prototypical kagome metal CsV$_3$Sb$_5$ can serve as a platform to establish a TRSB current-ordered state and explore its relationship with CDW, giant AHE, and superconductivity.
Li Yuan, Qibin Hou, Zihang Jiang (2021)
Visual recognition has been dominated by convolutional neural networks (CNNs) for years. Though the recently prevailing vision transformers (ViTs) have shown the great potential of self-attention based models in ImageNet classification, their performance is still inferior to that of the latest SOTA CNNs if no extra data are provided. In this work, we try to close the performance gap and demonstrate that attention-based models are indeed able to outperform CNNs. We find that a major factor limiting the performance of ViTs for ImageNet classification is their low efficacy in encoding fine-level features into the token representations. To resolve this, we introduce a novel outlook attention and present a simple and general architecture, termed Vision Outlooker (VOLO). Unlike self-attention, which focuses on global dependency modeling at a coarse level, the outlook attention efficiently encodes finer-level features and contexts into tokens, which is shown to be critically beneficial to recognition performance but largely ignored by self-attention. Experiments show that our VOLO achieves 87.1% top-1 accuracy on ImageNet-1K classification, the first model exceeding 87% accuracy on this competitive benchmark without using any extra training data. In addition, the pre-trained VOLO transfers well to downstream tasks, such as semantic segmentation. We achieve 84.3% mIoU score on the Cityscapes validation set and 54.3% on the ADE20K validation set. Code is available at https://github.com/sail-sg/volo.
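A single-head sketch of the outlook attention operation described above follows (the official multi-head implementation is at the repository linked): each location generates, directly from its own token via a linear layer, the weights used to aggregate the values in its local k x k window.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OutlookAttention(nn.Module):
    """Simplified, single-head outlook attention over (B, H, W, C) tokens."""
    def __init__(self, dim, k=3):
        super().__init__()
        self.k = k
        self.v = nn.Linear(dim, dim)
        self.attn = nn.Linear(dim, k ** 4)    # k*k weights per k*k window slot
        self.proj = nn.Linear(dim, dim)
        self.unfold = nn.Unfold(k, padding=k // 2)

    def forward(self, x):
        B, H, W, C = x.shape
        L, k2 = H * W, self.k * self.k
        v = self.v(x).permute(0, 3, 1, 2)                            # (B, C, H, W)
        v = self.unfold(v).reshape(B, C, k2, L).permute(0, 3, 2, 1)  # (B, L, k*k, C)
        a = self.attn(x).reshape(B, L, k2, k2).softmax(dim=-1)       # window weights
        out = (a @ v).permute(0, 3, 2, 1).reshape(B, C * k2, L)
        out = F.fold(out, (H, W), kernel_size=self.k, padding=self.k // 2, stride=1)
        return self.proj(out.permute(0, 2, 3, 1))                    # (B, H, W, C)
```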
Qibin Hou, Zihang Jiang, Li Yuan (2021)
In this paper, we present Vision Permutator, a conceptually simple and data-efficient MLP-like architecture for visual recognition. Recognizing the importance of the positional information carried by 2D feature representations, and unlike recent MLP-like models that encode spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections. This allows Vision Permutator to capture long-range dependencies along one spatial direction while preserving precise positional information along the other. The resulting position-sensitive outputs are then aggregated in a mutually complementary manner to form expressive representations of the objects of interest. We show that our Vision Permutators are formidable competitors to convolutional neural networks (CNNs) and vision transformers. Without relying on spatial convolutions or attention mechanisms, Vision Permutator achieves 81.5% top-1 accuracy on ImageNet without extra large-scale training data (e.g., ImageNet-22k) using only 25M learnable parameters, which is much better than most CNNs and vision transformers under the same model size constraint. When scaled up to 88M parameters, it attains 83.2% top-1 accuracy. We hope this work encourages research on rethinking the way of encoding spatial information and facilitates the development of MLP-like models. Code is available at https://github.com/Andrew-Qibin/VisionPermutator.
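A minimal sketch of the separate height/width/channel encoding follows. For simplicity it ties the height and width projections to a fixed feature-map size, whereas the actual Vision Permutator avoids this with a channel-segment permutation (see the repository above).

```python
import torch
import torch.nn as nn

class PermuteMLP(nn.Module):
    """Simplified Permutator block: three linear branches mix along height,
    width, and channels, then are re-weighted and summed."""
    def __init__(self, h, w, c):
        super().__init__()
        self.mlp_h = nn.Linear(h, h)       # mixes along the height axis
        self.mlp_w = nn.Linear(w, w)       # mixes along the width axis
        self.mlp_c = nn.Linear(c, c)       # mixes along the channel axis
        self.reweight = nn.Linear(c, 3)    # branch aggregation weights

    def forward(self, x):                  # x: (B, H, W, C)
        h = self.mlp_h(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)
        w = self.mlp_w(x.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)
        c = self.mlp_c(x)
        a = self.reweight(x.mean(dim=(1, 2))).softmax(dim=-1)   # (B, 3)
        w_h, w_w, w_c = a.unbind(dim=-1)
        return (h * w_h[:, None, None, None]
                + w * w_w[:, None, None, None]
                + c * w_c[:, None, None, None])
```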