
Training Image Estimators without Image Ground-Truth

Published by Ayan Chakrabarti
Publication date: 2019
Research field: Informatics Engineering
Research language: English





Deep neural networks have been very successful in image estimation applications such as compressive sensing and image restoration, as a means to estimate images from partial, blurry, or otherwise degraded measurements. These networks are trained on a large number of corresponding pairs of measurements and ground-truth images, and thus implicitly learn to exploit domain-specific image statistics. But unlike measurement data, a large training set of ground-truth images is often expensive or impractical to collect in many application settings. In this paper, we introduce an unsupervised framework for training image estimation networks from a training set that contains only measurements (two varied measurements per image) but no ground truth for the full images desired as output. We demonstrate that our framework can be applied to both regular and blind image estimation tasks, where in the latter case parameters of the measurement model (e.g., the blur kernel) are unknown during inference and, potentially, also during training. We evaluate our method for training networks for compressive sensing and blind deconvolution, considering both non-blind and blind training for the latter. Our unsupervised framework yields models that are nearly as accurate as those from fully supervised training, despite not having access to any ground-truth images.
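The abstract does not spell out the training objective, so the following is only a minimal sketch of one plausible way to use two measurements of the same image without ground truth: re-measure the estimate obtained from one measurement with the other measurement operator and compare it with the other measurement. The function names (`measure`, `swap_loss`, `zero_fill`, `mean_fill`), the binary-mask measurement model, and the toy estimators are all assumptions made for illustration, not the paper's implementation.

```python
def measure(image, mask):
    """Toy linear measurement: keep only the pixels where mask is 1 (assumed model)."""
    return [p * m for p, m in zip(image, mask)]


def swap_loss(estimate_fn, y1, mask1, y2, mask2):
    """Cross-measurement consistency loss (hypothetical formulation).

    The estimate computed from one measurement is re-measured with the
    other operator and compared against the other measurement, so no
    ground-truth image is ever needed.
    """
    x1 = estimate_fn(y1, mask1)
    x2 = estimate_fn(y2, mask2)
    r12 = [(a - b) ** 2 for a, b in zip(measure(x1, mask2), y2)]
    r21 = [(a - b) ** 2 for a, b in zip(measure(x2, mask1), y1)]
    return (sum(r12) + sum(r21)) / (len(r12) + len(r21))


def zero_fill(y, mask):
    """Trivial estimator: leave unobserved pixels at zero."""
    return list(y)


def mean_fill(y, mask):
    """Slightly better estimator: fill unobserved pixels with the observed mean."""
    observed = [v for v, m in zip(y, mask) if m]
    mean = sum(observed) / len(observed)
    return [v if m else mean for v, m in zip(y, mask)]


# Two different partial measurements of the same (never stored) flat image.
image = [2.0] * 6
mask1 = [1, 1, 1, 0, 0, 0]
mask2 = [0, 0, 1, 1, 1, 1]
y1, y2 = measure(image, mask1), measure(image, mask2)

print(swap_loss(zero_fill, y1, mask1, y2, mask2))
print(swap_loss(mean_fill, y1, mask1, y2, mask2))
```

In this toy setting the loss ranks the mean-filling estimator above the zero-filling one using measurements alone, which is the kind of signal an unsupervised training loop could minimize over network parameters.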




Read also

Deep neural networks for medical image reconstruction are traditionally trained using high-quality ground-truth images as training targets. Recent work on Noise2Noise (N2N) has shown the potential of using multiple noisy measurements of the same object as an alternative to having a ground truth. However, existing N2N-based methods cannot exploit information from various motion states, limiting their ability to learn on moving objects. This paper addresses this issue by proposing a novel motion-compensated deep image reconstruction (MoDIR) method that can use information from several unregistered and noisy measurements for training. MoDIR deals with object motion by including a deep registration module jointly trained with the deep reconstruction network without any ground-truth supervision. We validate MoDIR on both simulated and experimentally collected magnetic resonance imaging (MRI) data and show that it significantly improves imaging quality.
Regularization by denoising (RED) is an image reconstruction framework that uses an image denoiser as a prior. Recent work has shown the state-of-the-art performance of RED with learned denoisers corresponding to pre-trained convolutional neural nets (CNNs). In this work, we propose to broaden the current denoiser-centric view of RED by considering priors corresponding to networks trained for more general artifact removal. The key benefit of the proposed family of algorithms, called regularization by artifact-removal (RARE), is that it can leverage priors learned on datasets containing only undersampled measurements. This makes RARE applicable to problems where it is practically impossible to have fully sampled ground-truth data for training. We validate RARE on both simulated and experimentally collected data by reconstructing free-breathing whole-body 3D MRI into ten respiratory phases from heavily undersampled k-space measurements. Our results corroborate the potential of learning regularizers for iterative inversion directly on undersampled and noisy measurements.
Kai Xuan, Liping Si, Lichi Zhang 2020
High-quality magnetic resonance (MR) images, i.e., those with near-isotropic voxel spacing, are desirable in various scenarios of medical image analysis. However, many MR acquisitions use large inter-slice spacing in clinical practice. In this work, we propose a novel deep-learning-based super-resolution algorithm to generate high-resolution (HR) MR images with small slice spacing from low-resolution (LR) inputs of large slice spacing. Note that most existing deep-learning-based methods need paired LR and HR images to supervise the training, but in clinical scenarios usually no HR images are acquired. Therefore, our goal herein is to design and train the super-resolution network with no real HR ground truth. Specifically, two training stages are used in our method. First, HR images of reduced slice spacing are synthesized from real LR images using a variational auto-encoder (VAE). Although these synthesized HR images are as realistic as possible, they may still suffer from unexpected morphing induced by the VAE, implying that the synthesized HR images cannot be paired with the real LR images in terms of anatomical structure details. In the second stage, we degrade the synthesized HR images to generate corresponding LR images and train a super-resolution network based on these synthesized HR and degraded LR pairs. The underlying mechanism is that such a super-resolution network is less vulnerable to anatomical variability. Experiments on knee MR images successfully demonstrate the effectiveness of our proposed solution in reducing the slice spacing for better rendering.
Multi-focus image fusion, a technique to generate an all-in-focus image from two or more partially focused source images, can benefit many computer vision tasks. However, there is currently no large and realistic dataset to perform convincing evaluation and comparison of algorithms in multi-focus image fusion. Moreover, it is difficult to train a deep neural network for multi-focus image fusion without a suitable dataset. In this letter, we introduce a large and realistic multi-focus dataset called Real-MFF, which contains 710 pairs of source images with corresponding ground-truth images. The dataset is generated from light field images, and both the source images and the ground-truth images are realistic. To serve as both a well-established benchmark for existing multi-focus image fusion algorithms and an appropriate training dataset for future development of deep-learning-based methods, the dataset contains a variety of scenes, including buildings, plants, humans, shopping malls, squares and so on. We also evaluate 10 typical multi-focus algorithms on this dataset for the purpose of illustration.
A key limitation of deep convolutional neural network (DCNN) based image segmentation methods is the lack of generalizability. Manually traced training images are typically required when segmenting organs in a new imaging modality or from a distinct disease cohort. The manual effort can be alleviated if the manually traced images in one imaging modality (e.g., MRI) can be used to train a segmentation network for another imaging modality (e.g., CT). In this paper, we propose an end-to-end synthetic segmentation network (SynSeg-Net) to train a segmentation network for a target imaging modality without having manual labels. SynSeg-Net is trained by using (1) unpaired intensity images from source and target modalities, and (2) manual labels only from the source modality. SynSeg-Net is enabled by recent advances in cycle generative adversarial networks (CycleGAN) and DCNNs. We evaluate the performance of SynSeg-Net in two experiments: (1) MRI to CT splenomegaly synthetic segmentation for abdominal images, and (2) CT to MRI total intracranial volume (TICV) synthetic segmentation for brain images. The proposed end-to-end approach achieved superior performance to two-stage methods. Moreover, SynSeg-Net achieved comparable performance to the traditional segmentation network using target modality labels in certain scenarios. The source code of SynSeg-Net is publicly available (https://github.com/MASILab/SynSeg-Net).