Mandible segmentation in CT scans is crucial for 3D virtual surgical planning, where it provides a detailed surface representation of the patient's bone. Automatic segmentation of mandibles in CT scans is challenging because their shape and size vary widely between individuals. To address this challenge, we propose a convolutional neural network approach for mandible segmentation in CT scans that considers the continuum of anatomical structures through different planes. The proposed network adopts the U-Net architecture and combines the resulting 2D segmentations from three different planes into a 3D segmentation. We implement this approach on 11 neck CT scans and evaluate its performance. We achieve an average Dice coefficient of $0.89$ on two test mandible segmentations. Experimental results show that the proposed approach exhibits high accuracy for mandible segmentation in CT scans.
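The abstract above does not state how the three per-plane segmentations are combined, so the following is only a minimal sketch of one plausible choice: averaging the per-voxel probabilities from the three planes and thresholding the mean. The function names `fuse_three_planes` and `dice_coefficient`, the averaging rule, and the threshold of 0.5 are illustrative assumptions, not the authors' method. The Dice coefficient is computed with its standard definition, $2|A \cap B| / (|A| + |B|)$.

```python
import numpy as np


def fuse_three_planes(p_axial, p_coronal, p_sagittal, threshold=0.5):
    """Fuse three per-voxel probability volumes of identical shape.

    Assumption: simple averaging followed by thresholding. The source
    does not specify the fusion rule.
    """
    mean_prob = (p_axial + p_coronal + p_sagittal) / 3.0
    return mean_prob >= threshold


def dice_coefficient(pred, truth):
    """Standard Dice overlap between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom
```

With averaging and a threshold of 0.5, a voxel is kept when the three planes agree on average, so a confident prediction in two planes can outvote a miss in the third.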
Recently, accurate mandible segmentation in CT scans based on deep learning methods has attracted much attention. However, there still exist two major challenges, namely, metal artifacts among mandibles and large variations in shape or size among individuals. To address these two challenges, we propose a recurrent segmentation convolutional neural network (RSegCNN) that embeds segmentation convolutional neural network (SegCNN) into the recurrent neural network (RNN) for robust and accurate segmentation of the mandible. Such a design of the system takes into account the similarity and continuity of the mandible shapes captured in adjacent image slices in CT scans. The RSegCNN infers the mandible information based on the recurrent structure with the embedded encoder-decoder segmentation (SegCNN) components. The recurrent structure guides the system to exploit relevant and important information from adjacent slices, while the SegCNN component focuses on the mandible shapes from a single CT slice. We conducted extensive experiments to evaluate the proposed RSegCNN on two head and neck CT datasets. The experimental results show that the RSegCNN is significantly better than the state-of-the-art models for accurate mandible segmentation.
Accurate segmentation of medical images is important for clinical diagnosis. Existing automatic segmentation methods are mainly based on fully supervised learning and demand precise annotations, which are costly and time-consuming to obtain. To address this problem, we propose an automatic CT segmentation method based on weakly supervised learning, which trains an accurate segmentation model using only weak annotations in the form of bounding boxes. The proposed method consists of two steps: 1) generating pseudo masks from bounding-box annotations by k-means clustering, and 2) iteratively training a 3D U-Net convolutional neural network as the segmentation model. Data pre-processing is applied to improve performance. The method was validated on four datasets containing three types of organs, with a total of 627 CT volumes. For liver, spleen, and kidney segmentation, it achieved accuracies of 95.19%, 92.11%, and 91.45%, respectively. Experimental results demonstrate that our method is accurate, efficient, and suitable for clinical use.
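Step 1 of the method above (pseudo masks from bounding boxes via k-means) can be sketched as follows. This is a simplified 2D illustration under stated assumptions: clustering uses voxel intensity only, two clusters are used, and the brighter cluster is taken as the organ. None of these details are given in the abstract, and the function name `pseudo_mask_from_box` and the box layout `(r0, r1, c0, c1)` are hypothetical. The iterative 3D U-Net training of step 2 is not shown.

```python
import numpy as np


def pseudo_mask_from_box(image, box, n_iter=20):
    """Build a pseudo mask inside a bounding box with 2-cluster k-means.

    box = (r0, r1, c0, c1), half-open row/column ranges (assumed layout).
    Assumption: the brighter intensity cluster is the foreground.
    """
    r0, r1, c0, c1 = box
    patch = np.asarray(image[r0:r1, c0:c1], dtype=float)
    values = patch.ravel()
    # Initialise the two cluster centres at the intensity extremes.
    centers = np.array([values.min(), values.max()])
    for _ in range(n_iter):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        updated = np.array([
            values[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    # Final assignment with the converged centres.
    labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
    mask = np.zeros(np.shape(image), dtype=bool)
    mask[r0:r1, c0:c1] = (labels == centers.argmax()).reshape(patch.shape)
    return mask
```

Everything outside the box stays background, which is the one piece of supervision the bounding box provides with certainty.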
Response evaluation criteria in solid tumors (RECIST) is the standard measurement for tumor extent to evaluate treatment responses in cancer patients. As such, RECIST annotations must be accurate. However, RECIST annotations manually labeled by radiologists require professional knowledge and are time-consuming, subjective, and prone to inconsistency among different observers. To alleviate these problems, we propose a cascaded convolutional neural network based method to semi-automatically label RECIST annotations and drastically reduce annotation time. The proposed method consists of two stages: lesion region normalization and RECIST estimation. We employ the spatial transformer network (STN) for lesion region normalization, where a localization network is designed to predict the lesion region and the transformation parameters with a multi-task learning strategy. For RECIST estimation, we adapt the stacked hourglass network (SHN), introducing a relationship constraint loss to improve the estimation precision. STN and SHN can both be learned in an end-to-end fashion. We train our system on the DeepLesion dataset, obtaining a consensus model trained on RECIST annotations performed by multiple radiologists over a multi-year period. Importantly, when judged against the inter-reader variability of two additional radiologist raters, our system performs more stably and with less variability, suggesting that RECIST annotations can be reliably obtained with reduced labor and time.
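To make the target of the estimation stage concrete, the sketch below measures a lesion from a RECIST-style annotation. It assumes the annotation is given as four endpoints, two for the long axis and two for the short axis, and reports the two lengths plus the angle between the axes. The function name `recist_measurements`, the endpoint layout, and the `spacing_mm` parameter are illustrative assumptions. This is not the paper's network or its relationship constraint loss, whose exact form the abstract does not give.

```python
import numpy as np


def recist_measurements(long_axis, short_axis, spacing_mm=1.0):
    """Return (long length, short length, angle in degrees).

    long_axis and short_axis are each a pair of (x, y) endpoints in
    pixel coordinates (assumed layout). spacing_mm is an assumed
    isotropic pixel spacing used to convert pixels to millimetres.
    """
    long_axis = np.asarray(long_axis, dtype=float)
    short_axis = np.asarray(short_axis, dtype=float)
    v_long = (long_axis[1] - long_axis[0]) * spacing_mm
    v_short = (short_axis[1] - short_axis[0]) * spacing_mm
    long_len = float(np.linalg.norm(v_long))
    short_len = float(np.linalg.norm(v_short))
    if long_len == 0.0 or short_len == 0.0:
        raise ValueError("axis endpoints must be distinct")
    # Angle between the two axes, folded into the range [0, 90] degrees.
    cos_angle = abs(np.dot(v_long, v_short)) / (long_len * short_len)
    angle_deg = float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))
    return long_len, short_len, angle_deg
```

An angle far from 90 degrees, or a short axis longer than the long axis, would flag an implausible prediction under this assumed annotation layout.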
This paper proposes PiaNet, a new convolutional neural network with multiscale processing for detecting ground-glass opacity (GGO) nodules in 3D computed tomography (CT) images. PiaNet consists of a feature-extraction module and a prediction module. The former is constructed by introducing pyramid multiscale source connections into a contracting-expanding structure. The latter includes a bounding-box regressor and a classifier that simultaneously recognize GGO nodules and estimate bounding boxes at multiple scales. To train PiaNet, a two-stage transfer learning strategy is developed. In the first stage, the feature-extraction module is embedded into a classifier network that is trained on a large data set of GGO and non-GGO patches, which are generated by data augmentation from a small number of annotated CT scans. In the second stage, the pretrained feature-extraction module is loaded into PiaNet, and PiaNet is then fine-tuned using the annotated CT scans. We evaluate PiaNet on the LIDC-IDRI data set. The experimental results demonstrate that our method outperforms state-of-the-art counterparts, including the Subsolid CAD and Aidence systems and the S4ND and GA-SSD methods. PiaNet achieves a sensitivity of 91.75% with only one false positive per scan.
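The reported operating point (sensitivity at one false positive per scan) can be read off a ranked list of detections. The sketch below is a generic illustration of that computation, not the paper's evaluation protocol. It assumes each detection has already been labelled as a true or false positive, that every true-positive detection matches a distinct nodule, and that score ties can be ignored. The function name `sensitivity_at_fp_rate` and its parameters are hypothetical.

```python
import numpy as np


def sensitivity_at_fp_rate(scores, is_true_positive, n_nodules, n_scans,
                           max_fp_per_scan=1.0):
    """Best sensitivity reachable with at most max_fp_per_scan false positives.

    scores: confidence of each detection, pooled over all scans.
    is_true_positive: whether each detection hits a (distinct) nodule.
    n_nodules: total number of ground-truth nodules.
    n_scans: number of scans the detections were pooled from.
    """
    scores = np.asarray(scores, dtype=float)
    is_tp = np.asarray(is_true_positive, dtype=bool)
    # Rank detections from most to least confident.
    order = np.argsort(-scores)
    tp_cum = np.cumsum(is_tp[order])
    fp_cum = np.cumsum(~is_tp[order])
    # Keep only score cut-offs whose false-positive rate is within budget.
    allowed = fp_cum / n_scans <= max_fp_per_scan
    if not allowed.any():
        return 0.0
    return float(tp_cum[allowed].max() / n_nodules)
```

Lowering the score cut-off admits more detections, so sensitivity can only rise while the false-positive budget holds; the function returns the highest value reached within that budget.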
Automatic segmentation of the liver and its lesions is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment the liver and its lesions in abdominal CT images using cascaded fully convolutional neural networks (CFCNs) and dense 3D conditional random fields (CRFs). We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver, which serves as the ROI input for a second FCN. The second FCN segments only lesions within the liver ROIs predicted in step 1. We refine the segmentations of the CFCN using a dense 3D CRF that accounts for both spatial coherence and appearance. CFCN models were trained with 2-fold cross-validation on the abdominal CT dataset 3DIRCAD, comprising 15 hepatic tumor volumes. Our results show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for the liver, with computation times below 100 s per volume. We experimentally demonstrate the robustness of the proposed method as a decision support system with high accuracy and speed for use in daily clinical routine.
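The two-step cascade described above can be sketched as plain control flow. In this minimal illustration the two networks are passed in as callables that return binary masks, so any model (or a stand-in threshold) can be plugged in. The function name `cascade_segment`, the bounding-box cropping of the liver ROI, and the final intersection with the liver mask are assumptions made for the sketch. The dense 3D CRF refinement is omitted.

```python
import numpy as np


def cascade_segment(volume, liver_model, lesion_model):
    """Run a two-stage cascade: liver first, then lesions inside the liver ROI.

    liver_model and lesion_model are hypothetical callables that map an
    array to a binary mask of the same shape.
    """
    liver_mask = np.asarray(liver_model(volume), dtype=bool)
    lesion_mask = np.zeros(liver_mask.shape, dtype=bool)
    if not liver_mask.any():
        # No liver found: nothing for the second stage to look at.
        return liver_mask, lesion_mask
    # Crop to the bounding box of the predicted liver (assumed ROI form).
    idx = np.argwhere(liver_mask)
    lo, hi = idx.min(axis=0), idx.max(axis=0) + 1
    roi = tuple(slice(a, b) for a, b in zip(lo, hi))
    lesion_roi = np.asarray(lesion_model(volume[roi]), dtype=bool)
    # Keep only lesion voxels that lie inside the predicted liver.
    lesion_mask[roi] = lesion_roi & liver_mask[roi]
    return liver_mask, lesion_mask
```

Restricting the second stage to the liver ROI means the lesion model never sees the rest of the abdomen, which is the motivation the abstract gives for cascading.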