Research papers and master's and doctoral theses published by Jianpeng Zhang

Domain and Content Adaptive Convolution for Domain Generalization in Medical Image Segmentation

110 - Shishuai Hu , Zehui Liao , Jianpeng Zhang 2021

The domain gap, caused mainly by variable medical image quality, poses a major obstacle on the path between training a segmentation model in the lab and applying the trained model to unseen clinical data. To address this issue, domain generalization methods have been proposed, but these usually rely on static convolutions and are therefore less flexible. In this paper, we propose a multi-source domain generalization model, namely domain and content adaptive convolution (DCAC), for medical image segmentation. Specifically, we design the domain adaptive convolution (DAC) module and the content adaptive convolution (CAC) module and incorporate both into an encoder-decoder backbone. In the DAC module, a dynamic convolutional head is conditioned on the predicted domain code of the input to make our model adapt to the unseen target domain. In the CAC module, a dynamic convolutional head is conditioned on the global image features to make our model adapt to the test image. We evaluated the DCAC model against the baseline and four state-of-the-art domain generalization methods on the prostate segmentation, COVID-19 lesion segmentation, and optic cup/optic disc segmentation tasks. Our results indicate that the proposed DCAC model outperforms all competing methods on each segmentation task, and also demonstrate the effectiveness of the DAC and CAC modules.

Image and Video Processing, Computer Vision and Pattern Recognition

CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation

81 - Yutong Xie , Jianpeng Zhang , Chunhua Shen 2021

Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation. The convolutional operations used in these networks, however, inevitably have limitations in modeling long-range dependencies due to their inductive bias of locality and weight sharing. Although the Transformer was designed to address this issue, it suffers from extreme computational and spatial complexities in processing high-resolution 3D feature maps. In this paper, we propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation. Under this framework, the CNN is constructed to extract feature representations and an efficient deformable Transformer (DeTrans) is built to model the long-range dependency on the extracted feature maps. Different from the vanilla Transformer, which treats all image positions equally, our DeTrans pays attention only to a small set of key positions by introducing the deformable self-attention mechanism. Thus, the computational and spatial complexities of DeTrans are greatly reduced, making it possible to process the multi-scale and high-resolution feature maps, which are usually of paramount importance for image segmentation. We conduct an extensive evaluation on the Multi-Atlas Labeling Beyond the Cranial Vault (BCV) dataset that covers 11 major human organs. The results indicate that our CoTr leads to a substantial performance improvement over other CNN-based, transformer-based, and hybrid methods on the 3D multi-organ segmentation task. Code is available at https://github.com/YtongXie/CoTr

Computer Vision and Pattern Recognition

Inter-slice Context Residual Learning for 3D Medical Image Segmentation

118 - Jianpeng Zhang , Yutong Xie , Yan Wang 2020

Automated and accurate 3D medical image segmentation plays an essential role in assisting medical professionals to evaluate disease progression and make fast therapeutic schedules. Although deep convolutional neural networks (DCNNs) have been widely applied to this task, the accuracy of these models still needs to be further improved, mainly due to their limited ability to perceive 3D context. In this paper, we propose the 3D context residual network (ConResNet) for the accurate segmentation of 3D medical images. This model consists of an encoder, a segmentation decoder, and a context residual decoder. We design the context residual module and use it to bridge both decoders at each scale. Each context residual module contains both context residual mapping and context attention mapping: the former aims to explicitly learn the inter-slice context information, and the latter uses such context as a kind of attention to boost the segmentation accuracy. We evaluated this model on the MICCAI 2018 Brain Tumor Segmentation (BraTS) dataset and the NIH Pancreas Segmentation (Pancreas-CT) dataset. Our results not only demonstrate the effectiveness of the proposed 3D context residual learning scheme but also indicate that the proposed ConResNet is more accurate than six top-ranking methods in brain tumor segmentation and seven top-ranking methods in pancreas segmentation. Code is available at https://git.io/ConResNet

Image and Video Processing, Computer Vision and Pattern Recognition

PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation

100 - Yutong Xie , Jianpeng Zhang , Zehui Liao 2020

It has been widely recognized that the success of deep learning in image segmentation relies overwhelmingly on a large amount of densely annotated training data, which, however, are difficult to obtain due to the tremendous labor and expertise required, particularly for annotating 3D medical images. Although self-supervised learning (SSL) has shown great potential to address this issue, most SSL approaches focus only on image-level global consistency and ignore the local consistency, which plays a pivotal role in capturing structural information for dense prediction tasks such as segmentation. In this paper, we propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space. Specifically, we use the spatial transformations, which produce different augmented views of the same image, as a prior to deduce the location relation between two views, which is then used to align the feature maps of the same local region extracted from the two views. Next, we construct a local consistency loss to minimize the voxel-wise discrepancy between the aligned feature maps. Thus, our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information. This ability is conducive to downstream segmentation tasks. We conducted an extensive evaluation on four public computerized tomography (CT) datasets that cover 11 kinds of major human organs and two tumors. The results indicate that using a pre-trained PGL model to initialize a downstream network leads to a substantial performance improvement over both random initialization and initialization with global consistency-based models. Code and pre-trained weights will be made available at: https://git.io/PGL.
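
The alignment-then-compare idea in this abstract can be illustrated with a minimal 1D sketch. It is not the authors' implementation: it assumes the only spatial transformation is a crop with a known offset, that both views have the same length, and that the discrepancy is a plain mean squared error. The names `overlap_slices` and `local_consistency_loss` are hypothetical.

```python
def overlap_slices(offset_a, offset_b, length):
    """Use the known crop offsets (the prior) to locate the shared region.

    Returns index ranges of the overlap inside each view, or None if the
    two crops do not overlap. Both views are assumed to have `length` items.
    """
    lo = max(offset_a, offset_b)            # overlap start in source coordinates
    hi = min(offset_a, offset_b) + length   # overlap end in source coordinates
    if hi <= lo:
        return None
    return (lo - offset_a, hi - offset_a), (lo - offset_b, hi - offset_b)


def local_consistency_loss(feat_a, feat_b, offset_a, offset_b):
    """Mean squared discrepancy between the aligned parts of two feature maps.

    Assumption: squared error stands in for the voxel-wise discrepancy.
    """
    slices = overlap_slices(offset_a, offset_b, len(feat_a))
    if slices is None:
        return 0.0
    (a0, a1), (b0, b1) = slices
    diffs = [(x - y) ** 2 for x, y in zip(feat_a[a0:a1], feat_b[b0:b1])]
    return sum(diffs) / len(diffs)


# Two crops of the same toy signal, taken at offsets 1 and 3.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
view_a = signal[1:5]
view_b = signal[3:7]
loss_same = local_consistency_loss(view_a, view_b, 1, 3)
loss_perturbed = local_consistency_loss(view_a, [4.0, 4.0, 5.0, 6.0], 1, 3)
```

In this toy setting the features are the raw signal values, so identical content in the overlap gives zero loss, and any disagreement inside the overlap raises it.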

Computer Vision and Pattern Recognition

DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets

158 - Jianpeng Zhang , Yutong Xie , Yong Xia 2020

Due to the intensive cost of labor and expertise in annotating 3D medical images at a voxel level, most benchmark datasets are equipped with the annotations of only one type of organs and/or tumors, resulting in the so-called partial labeling issue. To address this, we propose a dynamic on-demand network (DoDNet) that learns to segment multiple organs and tumors on partially labeled datasets. DoDNet consists of a shared encoder-decoder architecture, a task encoding module, a controller for generating dynamic convolution filters, and a single but dynamic segmentation head. The information of the current segmentation task is encoded as a task-aware prior to tell the model what the task is expected to solve. Different from existing approaches, which fix kernels after training, the kernels in the dynamic head are generated adaptively by the controller, conditioned on both the input image and the assigned task. Thus, DoDNet is able to segment multiple organs and tumors, as done by multiple networks or a multi-head network, in a much more efficient and flexible manner. We have created a large-scale partially labeled dataset, termed MOTS, and demonstrated the superior performance of our DoDNet over other competitors on seven organ and tumor segmentation tasks. We also transferred the weights pre-trained on MOTS to a downstream multi-organ segmentation task and achieved state-of-the-art performance. This study provides a general 3D medical image segmentation model that has been pre-trained on a large-scale partially labeled dataset and can be extended (after fine-tuning) to downstream volumetric medical data segmentation tasks. The dataset and code are available at: https://git.io/DoDNet
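
A rough sketch of a controller that generates head kernels from the image and the task might look as follows. It is a simplification under stated assumptions, not the DoDNet code: the task is encoded as a one-hot vector, the controller is a single linear map with hand-picked toy weights, and the dynamic head is one 1x1 convolution (a per-voxel dot product). All function and variable names are hypothetical.

```python
def one_hot(task_id, num_tasks):
    """Encode the assigned task as a one-hot vector (assumed encoding)."""
    return [1.0 if i == task_id else 0.0 for i in range(num_tasks)]


def controller(image_feature, task_code, weights):
    """Map [global image feature ; task code] to the head's kernel parameters."""
    x = image_feature + task_code  # list concatenation
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def dynamic_head(voxel_features, kernel):
    """Apply the generated kernel as a 1x1 convolution over voxel features."""
    return [sum(k * f for k, f in zip(kernel, voxel)) for voxel in voxel_features]


# Toy setup: 2-channel features, 2 tasks, illustrative controller weights.
controller_weights = [
    [1.0, 0.0, 1.0, -1.0],
    [0.0, 1.0, -1.0, 1.0],
]
image_feature = [0.5, 0.5]            # stands in for a pooled encoder feature
voxels = [[1.0, 0.0], [0.0, 1.0]]     # stands in for decoder output per voxel

kernel_task0 = controller(image_feature, one_hot(0, 2), controller_weights)
kernel_task1 = controller(image_feature, one_hot(1, 2), controller_weights)
logits_task0 = dynamic_head(voxels, kernel_task0)
logits_task1 = dynamic_head(voxels, kernel_task1)
```

The same shared features produce different outputs for the two tasks because only the generated kernel changes, which is the property the abstract contrasts with kernels that stay fixed after training.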

Computer Vision and Pattern Recognition

Pairwise Relation Learning for Semi-supervised Gland Segmentation

186 - Yutong Xie , Jianpeng Zhang , Zhibin Liao 2020

Accurate and automated gland segmentation on histology tissue images is an essential but challenging task in the computer-aided diagnosis of adenocarcinoma. Despite their prevalence, deep learning models always require a large number of densely annotated training images, which are difficult to obtain due to the extensive labor and expert costs of histology image annotation. In this paper, we propose the pairwise relation-based semi-supervised (PRS^2) model for gland segmentation on histology images. This model consists of a segmentation network (S-Net) and a pairwise relation network (PR-Net). The S-Net is trained on labeled data for segmentation, and PR-Net is trained on both labeled and unlabeled data in an unsupervised way to enhance its image representation ability by exploiting the semantic consistency between each pair of images in the feature space. Since both networks share their encoders, the image representation ability learned by PR-Net can be transferred to S-Net to improve its segmentation performance. We also design the object-level Dice loss to address the issues caused by touching glands and combine it with two other loss functions for S-Net. We evaluated our model against five recent methods on the GlaS dataset and three recent methods on the CRAG dataset. Our results not only demonstrate the effectiveness of the proposed PR-Net and object-level Dice loss, but also indicate that our PRS^2 model achieves state-of-the-art gland segmentation performance on both benchmarks.

Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing

Viral Pneumonia Screening on Chest X-ray Images Using Confidence-Aware Anomaly Detection

171 - Jianpeng Zhang , Yutong Xie , Guansong Pang 2020

A cluster of viral pneumonia occurrences during a short period of time may be a harbinger of an outbreak or pandemic, like SARS, MERS, and the recent COVID-19. Rapid and accurate detection of viral pneumonia using chest X-ray can be significantly useful in large-scale screening and epidemic prevention, particularly when other chest imaging modalities are less available. Viral pneumonias often have diverse causes and exhibit notably different visual appearances on X-ray images. The evolution of viruses and the emergence of novel mutated viruses further result in substantial dataset shift, which greatly limits the performance of classification approaches. In this paper, we formulate the task of differentiating viral pneumonia from non-viral pneumonia and healthy controls as a one-class classification-based anomaly detection problem, and thus propose the confidence-aware anomaly detection (CAAD) model, which consists of a shared feature extractor, an anomaly detection module, and a confidence prediction module. If the anomaly score produced by the anomaly detection module is large enough or the confidence score estimated by the confidence prediction module is small enough, we accept the input as an anomaly case (i.e., viral pneumonia). The major advantage of our approach over binary classification is that we avoid modeling individual viral pneumonia classes explicitly and treat all known viral pneumonia cases as anomalies to reinforce the one-class model. The proposed model outperforms binary classification models on the clinical X-VIRAL dataset that contains 5,977 viral pneumonia (no COVID-19) cases, 18,619 non-viral pneumonia cases, and 18,774 healthy controls.
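
The decision rule stated in this abstract (a large anomaly score or a small confidence score flags the input) can be written as a short sketch. The threshold values, their defaults, and the function name are assumptions for illustration; the abstract does not give them.

```python
def is_anomaly(anomaly_score, confidence_score,
               anomaly_threshold=0.5, confidence_threshold=0.5):
    """Flag an input as an anomaly (viral pneumonia) per the stated rule.

    Assumption: both thresholds are hypothetical placeholders that would be
    tuned on validation data in practice.
    """
    score_is_large = anomaly_score > anomaly_threshold
    confidence_is_small = confidence_score < confidence_threshold
    return score_is_large or confidence_is_small


flag_high_score = is_anomaly(0.9, 0.9)       # large anomaly score
flag_low_confidence = is_anomaly(0.1, 0.1)   # low confidence in the prediction
flag_normal = is_anomaly(0.1, 0.9)           # low score, high confidence
```

Using "or" rather than "and" means a case the anomaly detector is unsure about is still flagged, which favors sensitivity in a screening setting.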

Image and Video Processing, Computer Vision and Pattern Recognition

A Sensitivity Analysis of Attention-Gated Convolutional Neural Networks for Sentence Classification

60 - Yang Liu , Jianpeng Zhang , Chao Gao 2019

In this paper, we investigate the effect of different hyperparameters, as well as different combinations of hyperparameter settings, on the performance of Attention-Gated Convolutional Neural Networks (AGCNNs), e.g., the kernel window size, the number of feature maps, the keep rate of the dropout layer, and the activation function. We draw practical advice from a wide range of empirical results. Through the sensitivity analysis, we further improve the hyperparameter settings of AGCNNs. Experiments show that our proposals could achieve an average of 0.81% and 0.67% improvements on AGCNN-NLReLU-rand and AGCNN-SELU-rand, respectively; and an average of 0.47% and 0.45% improvements on AGCNN-NLReLU-static and AGCNN-SELU-static, respectively.

Computation and Language, Machine Learning
