
AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes

Added by Jiancheng Yang
Publication date: 2020
Language: English





This paper addresses a fundamental challenge in 3D medical image processing: how to deal with imaging thickness. For anisotropic medical volumes, there is a significant performance gap between thin-slice (mostly 1mm) and thick-slice (mostly 5mm) volumes. Prior art tends to use 3D approaches for thin-slice volumes and 2D approaches for thick-slice volumes. We aim at a unified approach for both thin- and thick-slice medical volumes. Inspired by recent advances in video analysis, we propose AlignShift, a novel parameter-free operator that converts, in principle, any pretrained 2D network into a thickness-aware 3D network. Remarkably, the converted networks behave like 3D networks for thin slices, yet adaptively degenerate to 2D for thick slices. The unified thickness-aware representation learning is achieved by shifting and fusing aligned virtual slices according to the input imaging thickness. Extensive experiments on the public large-scale DeepLesion benchmark, consisting of 32K lesions for universal lesion detection, validate the effectiveness of our method, which outperforms the previous state of the art by considerable margins without bells and whistles. More importantly, to our knowledge, this is the first method that bridges the performance gap between thin- and thick-slice volumes within a unified framework. To improve research reproducibility, our PyTorch code is open source at https://github.com/M3DV/AlignShift.
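As a rough illustration of the shifting-and-fusing idea, the sketch below (in PyTorch, the framework used by the released code) resamples slice-wise features to virtual slices at a fixed physical spacing, shifts a fraction of the channels along the slice axis, and resamples back. The function name, the reference spacing, and the shift fraction are assumptions made here for illustration only; the actual operator is in the linked repository.

```python
import torch
import torch.nn.functional as F


def align_shift(x, thickness_mm, ref_thickness_mm=2.0, shift_frac=0.125):
    """Simplified, illustrative thickness-aware shift (not the official code).

    x: features of shape (N, C, D, H, W) from a 2D backbone applied slice
       by slice (D = number of slices).
    thickness_mm: slice thickness of the input volume in millimetres.
    """
    n, c, d, h, w = x.shape
    # Resample the slice axis to "virtual slices" at a fixed physical spacing,
    # so a one-virtual-slice shift spans the same physical distance regardless
    # of the acquisition thickness.
    virtual_d = max(1, round(d * thickness_mm / ref_thickness_mm))
    x = F.interpolate(x, size=(virtual_d, h, w), mode="trilinear",
                      align_corners=False)

    fold = int(c * shift_frac)
    out = x.clone()
    if virtual_d > 1 and fold > 0:
        # Exchange information between neighbouring virtual slices by shifting
        # one channel group down and another up along the slice axis.
        out[:, :fold, 1:] = x[:, :fold, :-1]
        out[:, fold:2 * fold, :-1] = x[:, fold:2 * fold, 1:]

    # Resample back to the original number of slices.  For thick slices the
    # shifted features mostly blend back into the same physical slice
    # (near-2D behaviour); for thin slices they reach across real slices
    # (3D behaviour).
    return F.interpolate(out, size=(d, h, w), mode="trilinear",
                         align_corners=False)
```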




Related research

While deep convolutional neural networks (CNNs) have been successfully applied to 2D image analysis, it is still challenging to apply them to 3D anisotropic volumes, especially when the within-slice resolution is much higher than the between-slice resolution and when the amount of 3D data is relatively small. On one hand, directly learning a CNN with 3D convolution kernels suffers from the lack of data and is likely to generalize poorly, while limited GPU memory constrains the model size and representational power. On the other hand, applying a 2D CNN with generalizable features to individual 2D slices ignores between-slice information, and coupling a 2D network with an LSTM to handle the between-slice information is not optimal due to the difficulty of LSTM learning. To overcome these challenges, we propose a 3D Anisotropic Hybrid Network (AH-Net) that transfers convolutional features learned from 2D images to 3D anisotropic volumes. Such a transfer inherits the desired strong generalization capability for within-slice information while naturally exploiting between-slice information for more effective modelling. The focal loss is further utilized for more effective end-to-end learning. We evaluate the proposed 3D AH-Net on two different medical image analysis tasks, namely lesion detection from Digital Breast Tomosynthesis volumes, and liver and liver tumor segmentation from Computed Tomography volumes, and obtain state-of-the-art results.
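The general transfer step described above, reusing pretrained 2D kernels inside an anisotropic 3D network, can be sketched roughly as follows. This is a generic inflation of a 2D convolution into a 1 x k x k 3D convolution, assumed here for illustration; it is not the AH-Net implementation, and the helper name is hypothetical.

```python
import torch
import torch.nn as nn


def inflate_conv2d_to_3d(conv2d: nn.Conv2d) -> nn.Conv3d:
    """Rough sketch: turn a pretrained 2D k x k convolution into an
    anisotropic 1 x k x k 3D convolution, so within-slice features are
    reused while between-slice context is handled by other layers."""
    kh, kw = conv2d.kernel_size
    conv3d = nn.Conv3d(
        conv2d.in_channels, conv2d.out_channels,
        kernel_size=(1, kh, kw),
        stride=(1,) + conv2d.stride,
        padding=(0,) + conv2d.padding,
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        # 2D weights (out, in, kh, kw) become 3D weights (out, in, 1, kh, kw).
        conv3d.weight.copy_(conv2d.weight.unsqueeze(2))
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d
```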
There has been a debate on whether to use 2D or 3D deep neural networks for volumetric organ segmentation. Both 2D and 3D models have their advantages and disadvantages. In this paper, we present an alternative framework, which trains 2D networks on different viewpoints for segmentation, and builds a 3D Volumetric Fusion Net (VFN) to fuse the 2D segmentation results. VFN is relatively shallow and contains much fewer parameters than most 3D networks, making our framework more efficient at integrating 3D information for segmentation. We train and test the segmentation and fusion modules individually, and propose a novel strategy, named cross-cross-augmentation, to make full use of the limited training data. We evaluate our framework on several challenging abdominal organs, and verify its superiority in segmentation accuracy and stability over existing 2D and 3D approaches.
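A toy sketch of the fusion stage described above, stacking the per-view 2D probability maps and refining them with a few 3D convolutions, is given below. The layer widths, depth, and class name are placeholders chosen for brevity, not the actual VFN architecture.

```python
import torch
import torch.nn as nn


class TinyFusionNet(nn.Module):
    """Toy stand-in for a shallow volumetric fusion network: the per-view 2D
    segmentation probability maps are stacked as channels and refined with a
    few 3D convolutions.  Hypothetical sizes; see the cited paper for VFN."""

    def __init__(self, n_views=3, n_classes=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(n_views * n_classes, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(16, n_classes, 1),
        )

    def forward(self, view_probs):            # list of (N, n_classes, D, H, W)
        x = torch.cat(view_probs, dim=1)      # fuse views along the channel axis
        return self.net(x)                    # refined 3D segmentation logits
```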
Wenyan Cong, Junyan Cao, Li Niu (2021)
Image harmonization has been significantly advanced by large-scale harmonization datasets. However, the current way of building such datasets is still labor-intensive, which limits their extendability. To address this problem, we propose to construct a large-scale rendered harmonization dataset, RHHarmony, with far less human effort, to augment the existing real-world dataset. To leverage both real-world and rendered images, we propose a cross-domain harmonization network, CharmNet, to bridge the gap between the two domains. Moreover, we employ well-designed style classifiers and losses to facilitate cross-domain knowledge transfer. Extensive experiments demonstrate the potential of using rendered images for image harmonization and the effectiveness of the proposed network. Our dataset and code are available at https://github.com/bcmi/Rendered_Image_Harmonization_Datasets.
Our daily human life is filled with a myriad of joint-action moments, be it children playing, adults working together (e.g., team sports), or strangers navigating through a crowd. Joint action brings individuals (and the embodiment of their emotions) together in space and in time. Yet little is known about how individual emotions propagate through embodied presence in a group, and how joint action changes individual emotion. In fact, the multi-agent component is largely missing from neuroscience-based approaches to emotion, and, conversely, joint-action research has not yet found a way to include emotion as one of the key parameters for modelling socio-motor interaction. In this review, we first identify this gap and then compile evidence from several branches of science showing a strong entanglement between emotion and acting together. We propose an integrative approach to bridge the gap, highlight five research avenues for doing so in behavioral neuroscience and the digital sciences, and address some of the key challenges in the area faced by modern societies.
Although significant advances have been made in human pose estimation from images using deep convolutional neural networks (ConvNets), 3D pose inference in the wild remains a big challenge, owing to the difficulty of obtaining 3D pose ground truth for outdoor environments. In this paper, we propose a novel framework to tackle this problem by exploiting, for each bone, whether it points forward or backward with respect to the camera view (we term this Forward-or-Backward Information, abbreviated as FBI). Our method first trains a ConvNet with two branches that maps an image of a human to both the 2D joint locations and the FBI of the bones. This information is then fed into a deep regression network to predict the 3D positions of the joints. To support training, we also developed an annotation user interface and labeled the FBI for around 12K in-the-wild images randomly selected from MPII (a public dataset with 2D pose annotations). Our experimental results on standard benchmarks demonstrate that our approach outperforms state-of-the-art methods both qualitatively and quantitatively.
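A schematic of the two-branch prediction followed by 3D lifting could look like the sketch below. The feature dimension, joint and bone counts, and the use of direct coordinate regression (rather than heatmaps) are assumptions made for brevity and do not reproduce the paper's exact architecture.

```python
import torch
import torch.nn as nn


class TwoBranchPoseHead(nn.Module):
    """Schematic head in the spirit of the FBI approach: one branch regresses
    2D joint coordinates, the other classifies each bone as forward or
    backward relative to the camera, and a small MLP lifts both to 3D."""

    def __init__(self, feat_dim=512, n_joints=16, n_bones=15):
        super().__init__()
        self.joints_2d = nn.Linear(feat_dim, n_joints * 2)   # (x, y) per joint
        self.fbi = nn.Linear(feat_dim, n_bones * 2)          # forward/backward logits
        self.lift = nn.Sequential(
            nn.Linear(n_joints * 2 + n_bones * 2, 256), nn.ReLU(inplace=True),
            nn.Linear(256, n_joints * 3),                    # 3D joint positions
        )

    def forward(self, feat):                  # feat: (N, feat_dim) backbone features
        j2d = self.joints_2d(feat)
        fbi = self.fbi(feat)
        j3d = self.lift(torch.cat([j2d, fbi], dim=1))
        return j2d, fbi, j3d
```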
