Research papers and master's and doctoral theses published by Lei Li

Multi-Constraint Shortest Path using Forest Hop Labeling

144 - Ziyi Liu, Lei Li, Mengxuan Zhang 2021

The Multi-Constraint Shortest Path (MCSP) problem aims to find the shortest path between two nodes in a network subject to a given constraint set. It is typically processed as a skyline path problem. However, the number of intermediate skyline paths grows as the network size and the number of constraints increase, which drives up the computational cost dramatically and leaves existing index-based methods hardly capable of obtaining the complete exact results. In this paper, we propose a novel high-dimensional skyline path concatenation method to avoid the expensive skyline path search, which then supports the efficient construction of a hop labeling index for MCSP queries. Specifically, a set of observations and techniques are proposed to improve the efficiency of concatenating two skyline path sets, an n-Cube technique is designed to prune the concatenation space among multiple hops, and a constraint pruning method is used to avoid unnecessary computation. Furthermore, to scale up to larger networks, we propose a novel forest hop labeling which enables parallel label construction from different network partitions. Our approach is the first method that can achieve both accuracy and efficiency for MCSP query answering. Extensive experiments on real-life road networks demonstrate the superiority of our method over the state-of-the-art solutions.
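To make the idea of concatenating two skyline path sets concrete, the following is a minimal sketch and not the method of the paper: it uses brute-force pairwise concatenation and omits the n-Cube and constraint pruning techniques named in the abstract. The function names (`dominates`, `skyline`, `concatenate`, `constrained_shortest`) and the convention that the first component of each cost vector is the distance are assumptions made for illustration.

```python
from itertools import product

def dominates(a, b):
    """True if cost vector a is no worse than b in every dimension and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(vectors):
    """Keep only the cost vectors that no other vector dominates."""
    unique = set(vectors)
    return sorted(v for v in unique if not any(dominates(u, v) for u in unique))

def concatenate(left, right):
    """Join every path s->h with every path h->t, then drop dominated sums.

    This is the naive quadratic version; the abstract's pruning techniques
    exist precisely to avoid enumerating all pairs.
    """
    sums = [tuple(x + y for x, y in zip(a, b)) for a, b in product(left, right)]
    return skyline(sums)

def constrained_shortest(paths, limits):
    """Assumed layout: paths[i][0] is the distance, paths[i][1:] are constrained costs."""
    feasible = [p for p in paths if all(c <= l for c, l in zip(p[1:], limits))]
    return min(feasible, default=None)

# Hypothetical skyline sets for s->h and h->t as (distance, toll) pairs.
s_to_h = [(1, 5), (3, 2)]
h_to_t = [(2, 2), (4, 1)]
s_to_t = concatenate(s_to_h, h_to_t)
best = constrained_shortest(s_to_t, limits=(5,))
```

Here the concatenated candidate (5, 6) is discarded because (5, 4) dominates it, and the query with a toll limit of 5 picks the shortest remaining feasible path.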

Data Structures and Algorithms

A Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation

181 - Shilei Liu, Xiaofeng Zhao, Bochao Li 2021

Neural conversation models have shown great potential for generating fluent and informative responses by introducing external background knowledge. Nevertheless, it is laborious to construct such knowledge-grounded dialogues, and existing models usually perform poorly when transferred to new domains with limited training samples. Therefore, building a knowledge-grounded dialogue system under the low-resource setting is still a crucial issue. In this paper, we propose a novel three-stage learning framework based on weakly supervised learning which benefits from large-scale ungrounded dialogues and an unstructured knowledge base. To better cooperate with this framework, we devise a variant of Transformer with a decoupled decoder which facilitates the disentangled learning of response generation and knowledge incorporation. Evaluation results on two benchmarks indicate that our approach can outperform other state-of-the-art methods with less training data, and even in the zero-resource scenario, our approach still performs well.

Computation and Language, Artificial Intelligence

Inferior Gap Between Primes

106 - Chunlei Liu 2021

It is proven that there are infinitely many prime pairs whose difference is no greater than 20.

Number Theory

Right Ventricular Segmentation from Short- and Long-Axis MRIs via Information Transition

142 - Lei Li, Wangbin Ding, Liqun Huang 2021

Right ventricular (RV) segmentation from magnetic resonance imaging (MRI) is a crucial step for cardiac morphology and function analysis. However, automatic RV segmentation from MRI is still challenging, mainly due to the heterogeneous intensity, the complex variable shapes, and the unclear RV boundary. Moreover, current methods for RV segmentation tend to suffer from performance degradation at the basal and apical slices of MRI. In this work, we propose an automatic RV segmentation framework, where the information from long-axis (LA) views is utilized to assist the segmentation of short-axis (SA) views via information transition. Specifically, we employ the transformed segmentation from LA views as prior information to extract the ROI from SA views for better segmentation. The information transition aims to remove the surrounding ambiguous regions in the SA views. We tested our model on a public dataset with 360 multi-center, multi-vendor and multi-disease subjects that consist of both LA and SA MRIs. Our experimental results show that including LA views can effectively improve the accuracy of the SA segmentation. Our model is publicly available at https://github.com/NanYoMy/MMs-2.

Image and Video Processing, Computer Vision and Pattern Recognition

GTG-Shapley: Efficient and Accurate Participant Contribution Evaluation in Federated Learning

87 - Zelei Liu, Yuanyuan Chen, Han Yu 2021

Federated Learning (FL) bridges the gap between collaborative machine learning and preserving data privacy. To sustain the long-term operation of an FL ecosystem, it is important to attract high-quality data owners with appropriate incentive schemes. As an important building block of such incentive schemes, it is essential to fairly evaluate participants' contributions to the performance of the final FL model without exposing their private data. Shapley Value (SV)-based techniques have been widely adopted to provide fair evaluation of FL participant contributions. However, existing approaches incur significant computation costs, making them difficult to apply in practice. In this paper, we propose the Guided Truncation Gradient Shapley (GTG-Shapley) approach to address this challenge. It reconstructs FL models from gradient updates for SV calculation instead of repeatedly training with different combinations of FL participants. In addition, we design a guided Monte Carlo sampling approach combined with within-round and between-round truncation to further reduce the number of model reconstructions and evaluations required. Extensive experiments under diverse realistic data distribution settings demonstrate that GTG-Shapley can closely approximate actual Shapley values, while significantly increasing computational efficiency compared to the state of the art, especially under non-i.i.d. settings.
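As a rough illustration of permutation-based Monte Carlo estimation of Shapley values with truncation, here is a minimal sketch. It is plain truncated Monte Carlo sampling, not GTG-Shapley: it has no guided sampling, no between-round truncation, and no model reconstruction from gradients. The function name `truncated_mc_shapley`, the `utility` callback, and the toy additive utility are all assumptions made for illustration; in an FL setting `utility` would stand in for evaluating a model built from a subset of participants.

```python
import random

def truncated_mc_shapley(players, utility, rounds=200, eps=0.0, seed=0):
    """Estimate Shapley values by averaging marginal gains over random orderings.

    Truncation: once the running coalition's utility is within eps of the
    full coalition's utility, the remaining players in that ordering are
    assigned zero marginal gain and their evaluations are skipped.
    """
    rng = random.Random(seed)
    full = utility(frozenset(players))
    totals = {p: 0.0 for p in players}
    for _ in range(rounds):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = utility(frozenset())
        for p in order:
            if abs(full - prev) <= eps:
                break  # truncated: skip the remaining utility evaluations
            coalition.add(p)
            cur = utility(frozenset(coalition))
            totals[p] += cur - prev
            prev = cur
    return {p: totals[p] / rounds for p in players}

# Toy additive utility (hypothetical): each participant adds a fixed amount,
# so the exact Shapley value of each participant equals its own weight.
weights = {"a": 1, "b": 2, "c": 3}
values = truncated_mc_shapley(list(weights), lambda s: sum(weights[p] for p in s))
```

With an additive utility every ordering yields the same marginal gain per participant, which makes the estimate easy to check by hand; real model-accuracy utilities are not additive, and a larger `eps` then trades accuracy for fewer evaluations.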

Artificial Intelligence

Knowledge-Grounded Dialogue with Reward-Driven Knowledge Selection

105 - Shilei Liu, Xiaofeng Zhao, Bochao Li 2021

Knowledge-grounded dialogue is the task of generating a fluent and informative response based on both the conversation context and a collection of external knowledge, in which knowledge selection plays an important role and attracts growing research interest. However, most existing models either select only one piece of knowledge or use all knowledge for response generation. The former may lose valuable information in the discarded knowledge, while the latter may introduce a lot of noise. At the same time, many approaches need to train the knowledge selector with knowledge labels that indicate ground-truth knowledge, but these labels are difficult to obtain and require a large amount of manual annotation. Motivated by these issues, we propose Knoformer, a dialogue response generation model based on reinforcement learning, which can automatically select one or more relevant pieces of knowledge from the knowledge pool and does not need knowledge labels during training. Knoformer is evaluated on two knowledge-guided conversation datasets, and achieves state-of-the-art performance.

Computation and Language

$D^*$ meson production in jet from combination of charm quark with light one

65 - Chuanhui Jiang, Honglei Li, Shi-Yuan Li 2021

In the framework of perturbative Quantum Chromodynamics factorization, the cross section of heavy meson production via the combination of a heavy quark with a light one can be factorized into the convolution of the combination matrix element, the light quark distribution function, and the hard partonic sub-cross section of the heavy quark production. The partonic distribution and the combination matrix element are each functions of a scaling variable, which is the momentum fraction of the corresponding quark with respect to the heavy meson. We studied the $D^{*\pm}$ production in jet via combination in pp collisions at the LHC. Our calculation can be summed with the fragmentation contribution, and the total result is comparable with the experimental data. The combination matrix elements can be further studied in various hadron production processes.

High Energy Physics - Phenomenology

Capacity Optimality of OAMP: Beyond IID Sensing Matrices and Gaussian Signaling

63 - Lei Liu, Shansuo Liang, Li Ping 2021

This paper studies a large unitarily invariant system (LUIS) involving a unitarily invariant sensing matrix, an arbitrary signal distribution, and forward error control (FEC) coding. We develop a universal Gram-Schmidt orthogonalization for orthogonal approximate message passing (OAMP). Numerous area properties are established based on the state evolution and minimum mean squared error (MMSE) property of OAMP in an un-coded LUIS. As a byproduct, we provide an alternative derivation for the constrained capacity of a LUIS. Under the assumption that the state evolution for OAMP is correct for the coded system, the achievable rate of OAMP is analyzed. We prove that OAMP achieves the constrained capacity of the LUIS with an arbitrary signal distribution provided that a matching condition is satisfied. Meanwhile, we elaborate a capacity-achieving coding principle for LUIS, based on which irregular low-density parity-check (LDPC) codes are optimized for binary signaling in the numerical results. We show that OAMP with the optimized codes has significant performance improvement over the un-optimized ones and the well-known Turbo linear MMSE algorithm. For quadrature phase-shift keying (QPSK) modulation, capacity-approaching bit error rate (BER) performances are observed under various channel conditions.

Information Theory, Signal Processing

VOLKS2: a transient search and localization pipeline for VLBI observations

130 - Lei Liu, Zhijun Xu, Zhen Yan 2021

We present VOLKS2, the second release of VLBI Observation for transient Localization Keen Searcher. The pipeline aims at transient search in regular VLBI observations as well as detection of single pulses from known sources in dedicated VLBI observations. The underlying method borrows from geodetic VLBI data processing, including fringe fitting to maximize the signal power and geodetic VLBI solving for localization. By filtering the candidate signals with multiple windows within a baseline and by cross matching with multiple baselines, RFIs are eliminated effectively. Unlike the station auto spectrum based method, RFI flagging is not required in the VOLKS2 pipeline. An EVN observation (EL060) is carried out to verify the pipeline's detection efficiency and localization accuracy in the whole FoV. The pipeline is parallelized with MPI and further accelerated with GPU, so as to exploit the hardware resources of modern GPU clusters. We can prove that, with proper optimization, VOLKS2 could achieve performance comparable to auto spectrum based pipelines. All the code and documents are publicly available, in the hope that our pipeline is useful for radio transient studies.

Instrumentation and Methods for Astrophysics

UPDesc: Unsupervised Point Descriptor Learning for Robust Registration

175 - Lei Li, Hongbo Fu, Maks Ovsjanikov 2021

In this work, we propose UPDesc, an unsupervised method to learn point descriptors for robust point cloud registration. Our work builds upon a recent supervised 3D CNN-based descriptor extraction framework, namely, 3DSmoothNet, which leverages a voxel-based representation to parameterize the surrounding geometry of interest points. Instead of using a predefined fixed-size local support in voxelization, which potentially limits the access of richer local geometry information, we propose to learn the support size in a data-driven manner. To this end, we design a differentiable voxelization module that can back-propagate gradients to the support size optimization. To optimize descriptor similarity, the prior 3D CNN work and other supervised methods require abundant correspondence labels or pose annotations of point clouds for crafting metric learning losses. Differently, we show that unsupervised learning of descriptor similarity can be achieved by performing geometric registration in networks. Our learning objectives consider descriptor similarity both across and within point clouds without supervision. Through extensive experiments on point cloud registration benchmarks, we show that our learned descriptors yield superior performance over existing unsupervised methods.

Computer Vision and Pattern Recognition, Graphics
