أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Lei Xing

MHFC: Multi-Head Feature Collaboration for Few-Shot Learning

107 - Shuai Shao , Lei Xing , Yan Wang 2021

Few-shot learning (FSL) aims to address the data-scarce problem. A standard FSL framework is composed of two components: (1) Pre-train. Employ the base data to generate a CNN-based feature extraction model (FEM). (2) Meta-test. Apply the trained FEM to acquire the novel datas features and recognize them. FSL relies heavily on the design of the FEM. However, various FEMs have distinct emphases. For example, several may focus more attention on the contour information, whereas others may lay particular emphasis on the texture information. The single-head feature is only a one-sided representation of the sample. Besides the negative influence of cross-domain (e.g., the trained FEM can not adapt to the novel class flawlessly), the distribution of novel data may have a certain degree of deviation compared with the ground truth distribution, which is dubbed as distribution-shift-problem (DSP). To address the DSP, we propose Multi-Head Feature Collaboration (MHFC) algorithm, which attempts to project the multi-head features (e.g., multiple features extracted from a variety of FEMs) to a unified space and fuse them to capture more discriminative information. Typically, first, we introduce a subspace learning method to transform the multi-head features to aligned low-dimensional representations. It corrects the DSP via learning the feature with more powerful discrimination and overcomes the problem of inconsistent measurement scales from different head features. Then, we design an attention block to update combination weights for each head feature automatically. It comprehensively considers the contribution of various perspectives and further improves the discrimination of features. We evaluate the proposed method on five benchmark datasets (including cross-domain experiments) and achieve significant improvements of 2.1%-7.8% compared with state-of-the-arts.

الرؤية الحاسوبية وتمييز الأنماط

CIM: Class-Irrelevant Mapping for Few-Shot Classification

232 - Shuai Shao , Lei Xing , Yixin Chen 2021

Few-shot classification (FSC) is one of the most concerned hot issues in recent years. The general setting consists of two phases: (1) Pre-train a feature extraction model (FEM) with base data (has large amounts of labeled samples). (2) Use the FEM t o extract the features of novel data (with few labeled samples and totally different categories from base data), then classify them with the to-be-designed classifier. The adaptability of pre-trained FEM to novel data determines the accuracy of novel features, thereby affecting the final classification performances. To this end, how to appraise the pre-trained FEM is the most crucial focus in the FSC community. It sounds like traditional Class Activate Mapping (CAM) based methods can achieve this by overlaying weighted feature maps. However, due to the particularity of FSC (e.g., there is no backpropagation when using the pre-trained FEM to extract novel features), we cannot activate the feature map with the novel classes. To address this challenge, we propose a simple, flexible method, dubbed as Class-Irrelevant Mapping (CIM). Specifically, first, we introduce dictionary learning theory and view the channels of the feature map as the bases in a dictionary. Then we utilize the feature map to fit the feature vector of an image to achieve the corresponding channel weights. Finally, we overlap the weighted feature map for visualization to appraise the ability of pre-trained FEM on novel data. For fair use of CIM in evaluating different models, we propose a new measurement index, called Feature Localization Accuracy (FLA). In experiments, we first compare our CIM with CAM in regular tasks and achieve outstanding performances. Next, we use our CIM to appraise several classical FSC frameworks without considering the classification results and discuss them.

الرؤية الحاسوبية وتمييز الأنماط

NeRP: Implicit Neural Representation Learning with Prior Embedding for Sparsely Sampled Image Reconstruction

109 - Liyue Shen , John Pauly , Lei Xing 2021

Image reconstruction is an inverse problem that solves for a computational image based on sampled sensor measurement. Sparsely sampled image reconstruction poses addition challenges due to limited measurements. In this work, we propose an implicit Ne ural Representation learning methodology with Prior embedding (NeRP) to reconstruct a computational image from sparsely sampled measurements. The method differs fundamentally from previous deep learning-based image reconstruction approaches in that NeRP exploits the internal information in an image prior, and the physics of the sparsely sampled measurements to produce a representation of the unknown subject. No large-scale data is required to train the NeRP except for a prior image and sparsely sampled measurements. In addition, we demonstrate that NeRP is a general methodology that generalizes to different imaging modalities such as CT and MRI. We also show that NeRP can robustly capture the subtle yet significant image changes required for assessing tumor progression.

معالجة الصور والفيديو الرؤية الحاسوبية وتمييز الأنماط

Operator Splitting for Adaptive Radiation Therapy with Nonlinear Health Dynamics

197 - Anqi Fu , Lei Xing , Stephen Boyd 2021

We present an optimization-based approach to radiation treatment planning over time. Our approach formulates treatment planning as an optimal control problem with nonlinear patient health dynamics derived from the standard linear-quadratic cell survi val model. As the formulation is nonconvex, we propose a method for obtaining an approximate solution by solving a sequence of convex optimization problems. This method is fast, efficient, and robust to model error, adapting readily to changes in the patients health between treatment sessions. Moreover, we show that it can be combined with the operator splitting method ADMM to produce an algorithm that is highly scalable and can handle large clinical cases. We introduce an open-source Python implementation of our algorithm, AdaRad, and demonstrate its performance on several examples.

الفيزياء الطبية الهندسة الحاسوبية، المالية،العلوم التحسين والتحكم

DualNorm-UNet: Incorporating Global and Local Statistics for Robust Medical Image Segmentation

101 - Junfei Xiao , Lequan Yu , Lei Xing 2021

Batch Normalization (BN) is one of the key components for accelerating network training, and has been widely adopted in the medical image analysis field. However, BN only calculates the global statistics at the batch level, and applies the same affin e transformation uniformly across all spatial coordinates, which would suppress the image contrast of different semantic structures. In this paper, we propose to incorporate the semantic class information into normalization layers, so that the activations corresponding to different regions (i.e., classes) can be modulated differently. We thus develop a novel DualNorm-UNet, to concurrently incorporate both global image-level statistics and local region-wise statistics for network normalization. Specifically, the local statistics are integrated by adaptively modulating the activations along different class regions via the learned semantic masks in the normalization layer. Compared with existing methods, our approach exploits semantic knowledge at normalization and yields more discriminative features for robust segmentation results. More importantly, our network demonstrates superior abilities in capturing domain-invariant information from multiple domains (institutions) of medical data. Extensive experiments show that our proposed DualNorm-UNet consistently improves the performance on various segmentation tasks, even in the face of more complex and variable data distributions. Code is available at https://github.com/lambert-x/DualNorm-Unet.

معالجة الصور والفيديو الرؤية الحاسوبية وتمييز الأنماط

Inducing Hierarchical Compositional Model by Sparsifying Generator Network

57 - Xianglei Xing , Tianfu Wu , Song-Chun Zhu 2019

This paper proposes to learn hierarchical compositional AND-OR model for interpretable image synthesis by sparsifying the generator network. The proposed method adopts the scene-objects-parts-subparts-primitives hierarchy in image representation. A s cene has different types (i.e., OR) each of which consists of a number of objects (i.e., AND). This can be recursively formulated across the scene-objects-parts-subparts hierarchy and is terminated at the primitive level (e.g., wavelets-like basis). To realize this AND-OR hierarchy in image synthesis, we learn a generator network that consists of the following two components: (i) Each layer of the hierarchy is represented by an over-complete set of convolutional basis functions. Off-the-shelf convolutional neural architectures are exploited to implement the hierarchy. (ii) Sparsity-inducing constraints are introduced in end-to-end training, which induces a sparsely activated and sparsely connected AND-OR model from the initially densely connected generator network. A straightforward sparsity-inducing constraint is utilized, that is to only allow the top-$k$ basis functions to be activated at each layer (where $k$ is a hyper-parameter). The learned basis functions are also capable of image reconstruction to explain the input images. In experiments, the proposed method is tested on four benchmark datasets. The results show that meaningful and interpretable hierarchical representations are learned with better qualities of image synthesis and reconstruction obtained than baselines.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

Using Edge-Preserving Algorithm with Non-local Mean for Significantly Improved Image-Domain Material Decomposition in Dual Energy CT

221 - Wei Zhao , Tianye Niu , Lei Xing 2016

Increased noise is a general concern for dual-energy material decomposition. Here, we develop an image-domain material decomposition algorithm for dual-energy CT (DECT) by incorporating an edge-preserving filter into the Local HighlY constrained back PRojection Reconstruction (HYPR-LR) framework. With effective use of the non-local mean, the proposed algorithm, which is referred to as HYPR-NLM, reduces the noise in dual energy decomposition while preserving the accuracy of quantitative measurement and spatial resolution of the material-specific dual energy images. We demonstrate the noise reduction and resolution preservation of the algorithm with iodine concentrate numerical phantom by comparing the HYPR-NLM algorithm to the direct matrix inversion, HYPR-LR and iterative image-domain material decomposition (Iter-DECT). We also show the superior performance of the HYPR-NLM over the existing methods by using two sets of cardiac perfusing imaging data. The reference drawn from the comparison study includes: (1) HYPR-NLM significantly reduces the DECT material decomposition noise while preserving quantitative measurements and high-frequency edge information, and (2) HYPR-NLM is robust with respect to parameter selection.

الفيزياء الطبية

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد