No Arabic abstract
Deformable image registration between Computed Tomography (CT) images and Magnetic Resonance (MR) imaging is essential for many image-guided therapies. In this paper, we propose a novel translation-based unsupervised deformable image registration method. Distinct from other translation-based methods that attempt to convert the multimodal problem (e.g., CT-to-MR) into a unimodal problem (e.g., MR-to-MR) via image-to-image translation, our method leverages the deformation fields estimated from both: (i) the translated MR image and (ii) the original CT image in a dual-stream fashion, and automatically learns how to fuse them to achieve better registration performance. The multimodal registration network can be effectively trained by computationally efficient similarity metrics without any ground-truth deformation. Our method has been evaluated on two clinical datasets and demonstrates promising results compared to state-of-the-art traditional and learning-based methods.
Multi-modal magnetic resonance imaging (MRI) is essential in clinics for comprehensive diagnosis and surgical planning. Nevertheless, the segmentation of multi-modal MR images tends to be time-consuming and challenging. Convolutional neural network (CNN)-based multi-modal MR image analysis commonly proceeds with multiple down-sampling streams fused at one or several layers. Although inspiring performance has been achieved, the feature fusion is usually conducted through simple summation or concatenation without optimization. In this work, we propose a supervised image fusion method to selectively fuse the useful information from different modalities and suppress the respective noise signals. Specifically, an attention block is introduced as guidance for the information selection. From the different modalities, one modality that contributes most to the results is selected as the master modality, which supervises the information selection of the other assistant modalities. The effectiveness of the proposed method is confirmed through breast mass segmentation in MR images of two modalities and better segmentation results are achieved compared to the state-of-the-art methods.
Multi-modal image registration is a challenging problem that is also an important clinical task for many real applications and scenarios. As a first step in analysis, deformable registration among different image modalities is often required in order to provide complementary visual information. During registration, semantic information is key to match homologous points and pixels. Nevertheless, many conventional registration methods are incapable in capturing high-level semantic anatomical dense correspondences. In this work, we propose a novel multi-task learning system, JSSR, based on an end-to-end 3D convolutional neural network that is composed of a generator, a registration and a segmentation component. The system is optimized to satisfy the implicit constraints between different tasks in an unsupervised manner. It first synthesizes the source domain images into the target domain, then an intra-modal registration is applied on the synthesized images and target images. The segmentation module are then applied on the synthesized and target images, providing additional cues based on semantic correspondences. The supervision from another fully-annotated dataset is used to regularize the segmentation. We extensively evaluate JSSR on a large-scale medical image dataset containing 1,485 patient CT imaging studies of four different contrast phases (i.e., 5,940 3D CT scans with pathological livers) on the registration, segmentation and synthesis tasks. The performance is improved after joint training on the registration and segmentation tasks by 0.9% and 1.9% respectively compared to a highly competitive and accurate deep learning baseline. The registration also consistently outperforms conventional state-of-the-art multi-modal registration methods.
We present Free Point Transformer (FPT) - a deep neural network architecture for non-rigid point-set registration. Consisting of two modules, a global feature extraction module and a point transformation module, FPT does not assume explicit constraints based on point vicinity, thereby overcoming a common requirement of previous learning-based point-set registration methods. FPT is designed to accept unordered and unstructured point-sets with a variable number of points and uses a model-free approach without heuristic constraints. Training FPT is flexible and involves minimizing an intuitive unsupervised loss function, but supervised, semi-supervised, and partially- or weakly-supervised training are also supported. This flexibility makes FPT amenable to multimodal image registration problems where the ground-truth deformations are difficult or impossible to measure. In this paper, we demonstrate the application of FPT to non-rigid registration of prostate magnetic resonance (MR) imaging and sparsely-sampled transrectal ultrasound (TRUS) images. The registration errors were 4.71 mm and 4.81 mm for complete TRUS imaging and sparsely-sampled TRUS imaging, respectively. The results indicate superior accuracy to the alternative rigid and non-rigid registration algorithms tested and substantially lower computation time. The rapid inference possible with FPT makes it particularly suitable for applications where real-time registration is beneficial.
We describe an adversarial learning approach to constrain convolutional neural network training for image registration, replacing heuristic smoothness measures of displacement fields often used in these tasks. Using minimally-invasive prostate cancer intervention as an example application, we demonstrate the feasibility of utilizing biomechanical simulations to regularize a weakly-supervised anatomical-label-driven registration network for aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural transrectal ultrasound (TRUS) images. A discriminator network is optimized to distinguish the registration-predicted displacement fields from the motion data simulated by finite element analysis. During training, the registration network simultaneously aims to maximize similarity between anatomical labels that drives image alignment and to minimize an adversarial generator loss that measures divergence between the predicted- and simulated deformation. The end-to-end trained network enables efficient and fully-automated registration that only requires an MR and TRUS image pair as input, without anatomical labels or simulated data during inference. 108 pairs of labelled MR and TRUS images from 76 prostate cancer patients and 71,500 nonlinear finite-element simulations from 143 different patients were used for this study. We show that, with only gland segmentation as training labels, the proposed method can help predict physically plausible deformation without any other smoothness penalty. Based on cross-validation experiments using 834 pairs of independent validation landmarks, the proposed adversarial-regularized registration achieved a target registration error of 6.3 mm that is significantly lower than those from several other regularization methods.
Low-rankness is important in the hyperspectral image (HSI) denoising tasks. The tensor nuclear norm (TNN), defined based on the tensor singular value decomposition, is a state-of-the-art method to describe the low-rankness of HSI. However, TNN ignores some of the physical meanings of HSI in tackling the denoising tasks, leading to suboptimal denoising performance. In this paper, we propose the multi-modal and frequency-weighted tensor nuclear norm (MFWTNN) and the non-convex MFWTNN for HSI denoising tasks. Firstly, we investigate the physical meaning of frequency components and reconsider their weights to improve the low-rank representation ability of TNN. Meanwhile, we also consider the correlation among two spatial dimensions and the spectral dimension of HSI and combine the above improvements to TNN to propose MFWTNN. Secondly, we use non-convex functions to approximate the rank function of the frequency tensor and propose the NonMFWTNN to relax the MFWTNN better. Besides, we adaptively choose bigger weights for slices mainly containing noise information and smaller weights for slices containing profile information. Finally, we develop the efficient alternating direction method of multiplier (ADMM) based algorithm to solve the proposed models, and the effectiveness of our models are substantiated in simulated and real HSI datasets.