No Arabic abstract
Annotating multiple organs in 3D medical images is time-consuming and costly. Meanwhile, there exist many single-organ datasets with one specific organ annotated. This paper investigates how to learn a multi-organ segmentation model leveraging a set of binary-labeled datasets. A novel Multi-teacher Single-student Knowledge Distillation (MS-KD) framework is proposed, where the teacher models are pre-trained single-organ segmentation networks, and the student model is a multi-organ segmentation network. Considering that each teacher focuses on different organs, a region-based supervision method, consisting of logits-wise supervision and feature-wise supervision, is proposed. Each teacher supervises the student in two regions, the organ region where the teacher is considered as an expert and the background region where all teachers agree. Extensive experiments on three public single-organ datasets and a multi-organ dataset have demonstrated the effectiveness of the proposed MS-KD framework.
Due to the intensive cost of labor and expertise in annotating 3D medical images at a voxel level, most benchmark datasets are equipped with the annotations of only one type of organs and/or tumors, resulting in the so-called partially labeling issue. To address this, we propose a dynamic on-demand network (DoDNet) that learns to segment multiple organs and tumors on partially labeled datasets. DoDNet consists of a shared encoder-decoder architecture, a task encoding module, a controller for generating dynamic convolution filters, and a single but dynamic segmentation head. The information of the current segmentation task is encoded as a task-aware prior to tell the model what the task is expected to solve. Different from existing approaches which fix kernels after training, the kernels in dynamic head are generated adaptively by the controller, conditioned on both input image and assigned task. Thus, DoDNet is able to segment multiple organs and tumors, as done by multiple networks or a multi-head network, in a much efficient and flexible manner. We have created a large-scale partially labeled dataset, termed MOTS, and demonstrated the superior performance of our DoDNet over other competitors on seven organ and tumor segmentation tasks. We also transferred the weights pre-trained on MOTS to a downstream multi-organ segmentation task and achieved state-of-the-art performance. This study provides a general 3D medical image segmentation model that has been pre-trained on a large-scale partially labelled dataset and can be extended (after fine-tuning) to downstream volumetric medical data segmentation tasks. The dataset and code areavailableat: https://git.io/DoDNet
Multi-organ segmentation has extensive applications in many clinical applications. To segment multiple organs of interest, it is generally quite difficult to collect full annotations of all the organs on the same images, as some medical centers might only annotate a portion of the organs due to their own clinical practice. In most scenarios, one might obtain annotations of a single or a few organs from one training set, and obtain annotations of the the other organs from another set of training images. Existing approaches mostly train and deploy a single model for each subset of organs, which are memory intensive and also time inefficient. In this paper, we propose to co-train weight-averaged models for learning a unified multi-organ segmentation network from few-organ datasets. We collaboratively train two networks and let the coupled networks teach each other on un-annotated organs. To alleviate the noisy teaching supervisions between the networks, the weighted-averaged models are adopted to produce more reliable soft labels. In addition, a novel region mask is utilized to selectively apply the consistent constraint on the un-annotated organ regions that require collaborative teaching, which further boosts the performance. Extensive experiments on three public available single-organ datasets LiTS, KiTS, Pancreas and manually-constructed single-organ datasets from MOBA show that our method can better utilize the few-organ datasets and achieves superior performance with less inference computational cost.
Accurate and robust segmentation of abdominal organs on CT is essential for many clinical applications such as computer-aided diagnosis and computer-aided surgery. But this task is challenging due to the weak boundaries of organs, the complexity of the background, and the variable sizes of different organs. To address these challenges, we introduce a novel framework for multi-organ segmentation by using organ-attention networks with reverse connections (OAN-RCs) which are applied to 2D views, of the 3D CT volume, and output estimates which are combined by statistical fusion exploiting structural similarity. OAN is a two-stage deep convolutional network, where deep network features from the first stage are combined with the original image, in a second stage, to reduce the complex background and enhance the discriminative information for the target organs. RCs are added to the first stage to give the lower layers semantic information thereby enabling them to adapt to the sizes of different organs. Our networks are trained on 2D views enabling us to use holistic information and allowing efficient computation. To compensate for the limited cross-sectional information of the original 3D volumetric CT, multi-sectional images are reconstructed from the three different 2D view directions. Then we combine the segmentation results from the different views using statistical fusion, with a novel term relating the structural similarity of the 2D views to the original 3D structure. To train the network and evaluate results, 13 structures were manually annotated by four human raters and confirmed by a senior expert on 236 normal cases. We tested our algorithm and computed Dice-Sorensen similarity coefficients and surface distances for evaluating our estimates of the 13 structures. Our experiments show that the proposed approach outperforms 2D- and 3D-patch based state-of-the-art methods.
Lesion detection is an important problem within medical imaging analysis. Most previous work focuses on detecting and segmenting a specialized category of lesions (e.g., lung nodules). However, in clinical practice, radiologists are responsible for finding all possible types of anomalies. The task of universal lesion detection (ULD) was proposed to address this challenge by detecting a large variety of lesions from the whole body. There are multiple heterogeneously labeled datasets with varying label completeness: DeepLesion, the largest dataset of 32,735 annotated lesions of various types, but with even more missing annotation instances; and several fully-labeled single-type lesion datasets, such as LUNA for lung nodules and LiTS for liver tumors. In this work, we propose a novel framework to leverage all these datasets together to improve the performance of ULD. First, we learn a multi-head multi-task lesion detector using all datasets and generate lesion proposals on DeepLesion. Second, missing annotations in DeepLesion are retrieved by a new method of embedding matching that exploits clinical prior knowledge. Last, we discover suspicious but unannotated lesions using knowledge transfer from single-type lesion detectors. In this way, reliable positive and negative regions are obtained from partially-labeled and unlabeled images, which are effectively utilized to train ULD. To assess the clinically realistic protocol of 3D volumetric ULD, we fully annotated 1071 CT sub-volumes in DeepLesion. Our method outperforms the current state-of-the-art approach by 29% in the metric of average sensitivity.
Most existing approaches to train a unified multi-organ segmentation model from several single-organ datasets require simultaneously access multiple datasets during training. In the real scenarios, due to privacy and ethics concerns, the training data of the organs of interest may not be publicly available. To this end, we investigate a data-free incremental organ segmentation scenario and propose a novel incremental training framework to solve it. We use the pretrained model instead of its own training data for privacy protection. Specifically, given a pretrained $K$ organ segmentation model and a new single-organ dataset, we train a unified $K+1$ organ segmentation model without accessing any data belonging to the previous training stages. Our approach consists of two parts: the background label alignment strategy and the uncertainty-aware guidance strategy. The first part is used for knowledge transfer from the pretained model to the training model. The second part is used to extract the uncertainty information from the pretrained model to guide the whole knowledge transfer process. By combing these two strategies, more reliable information is extracted from the pretrained model without original training data. Experiments on multiple publicly available pretrained models and a multi-organ dataset MOBA have demonstrated the effectiveness of our framework.