مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Distributed-memory large deformation diffeomorphic 3D image registration

168 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Andreas Mang

تاريخ النشر 2016

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Andreas Mang - Amir Gholami - George Biros

النظم الموزعة والتوازية والحوسبة العنقودية الرؤية الحاسوبية وتمييز الأنماط التحسين والتحكم

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present a parallel distributed-memory algorithm for large deformation diffeomorphic registration of volumetric images that produces large isochoric deformations (locally volume preserving). Image registration is a key technology in medical image analysis. Our algorithm uses a partial differential equation constrained optimal control formulation. Finding the optimal deformation map requires the solution of a highly nonlinear problem that involves pseudo-differential operators, biharmonic operators, and pure advection operators both forward and back- ward in time. A key issue is the time to solution, which poses the demand for efficient optimization methods as well as an effective utilization of high performance computing resources. To address this problem we use a preconditioned, inexact, Gauss-Newton- Krylov solver. Our algorithm integrates several components: a spectral discretization in space, a semi-Lagrangian formulation in time, analytic adjoints, different regularization functionals (including volume-preserving ones), a spectral preconditioner, a highly optimized distributed Fast Fourier Transform, and a cubic interpolation scheme for the semi-Lagrangian time-stepping. We demonstrate the scalability of our algorithm on images with resolution of up to $1024^3$ on the Maverick and Stampede systems at the Texas Advanced Computing Center (TACC). The critical problem in the medical imaging application domain is strong scaling, that is, solving registration problems of a moderate size of $256^3$---a typical resolution for medical images. We are able to solve the registration problem for images of this size in less than five seconds on 64 x86 nodes of TACCs Maverick system.

قيم البحث

116 - Andreas Mang , Amir Gholami , Christos Davatzikos 2018

With this work, we release CLAIRE, a distributed-memory implementation of an effective solver for constrained large deformation diffeomorphic image registration problems in three dimensions. We consider an optimal control formulation. We invert for a stationary velocity field that parameterizes the deformation map. Our solver is based on a globalized, preconditioned, inexact reduced space Gauss--Newton--Krylov scheme. We exploit state-of-the-art techniques in scientific computing to develop an effective solver that scales to thousands of distributed memory nodes on high-end clusters. We present the formulation, discuss algorithmic features, describe the software package, and introduce an improved preconditioner for the reduced space Hessian to speed up the convergence of our solver. We test registration performance on synthetic and real data. We demonstrate registration accuracy on several neuroimaging datasets. We compare the performance of our scheme against different flavors of the Demons algorithm for diffeomorphic image registration. We study convergence of our preconditioner and our overall algorithm. We report scalability results on state-of-the-art supercomputing platforms. We demonstrate that we can solve registration problems for clinically relevant data sizes in two to four minutes on a standard compute node with 20 cores, attaining excellent data fidelity. With the present work we achieve a speedup of (on average) 5$times$ with a peak performance of up to 17$times$ compared to our former work.

التحسين والتحكم الرؤية الحاسوبية وتمييز الأنماط النظم الموزعة والتوازية والحوسبة العنقودية

Fast GPU 3D Diffeomorphic Image Registration

334 - Malte Brunn , Naveen Himthani , George Biros 2020

3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss--Newton--Krylov solver for diffeomorphic registration of two images. Our work extends t he publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. Our contributions are new algorithms to significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We deploy (i) highly-optimized, mixed-precision GPU-kernels for the evaluation of scattered-data interpolation, (ii) replace Fast-Fourier-Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register $256^3$ clinical images in less than 6 seconds on a single NVIDIA Tesla V100. This amounts to over 20$times$ speed-up over the current version of CLAIRE and over 30$times$ speed-up over existing GPU implementations.

النظم الموزعة والتوازية والحوسبة العنقودية معالجة الصور والفيديو التحسين والتحكم

Multi-Node Multi-GPU Diffeomorphic Image Registration for Large-Scale Imaging Problems

148 - Malte Brunn , Naveen Himthani , George Biros 2020

We present a Gauss-Newton-Krylov solver for large deformation diffeomorphic image registration. We extend the publicly available CLAIRE library to multi-node multi-graphics processing unit (GPUs) systems and introduce novel algorithmic modifications that significantly improve performance. Our contributions comprise ($i$) a new preconditioner for the reduced-space Gauss-Newton Hessian system, ($ii$) a highly-optimized multi-node multi-GPU implementation exploiting device direct communication for the main computational kernels (interpolation, high-order finite difference operators and Fast-Fourier-Transform), and ($iii$) a comparison with state-of-the-art CPU and GPU implementations. We solve a $256^3$-resolution image registration problem in five seconds on a single NVIDIA Tesla V100, with a performance speedup of 70% compared to the state-of-the-art. In our largest run, we register $2048^3$ resolution images (25 B unknowns; approximately 152$times$ larger than the largest problem solved in state-of-the-art GPU implementations) on 64 nodes with 256 GPUs on TACCs Longhorn system.

النظم الموزعة والتوازية والحوسبة العنقودية التحسين والتحكم

Constrained $H^1$-regularization schemes for diffeomorphic image registration

386 - Andreas Mang , George Biros 2015

We propose regularization schemes for deformable registration and efficient algorithms for their numerical approximation. We treat image registration as a variational optimal control problem. The deformation map is parametrized by its velocity. Tikho nov regularization ensures well-posedness. Our scheme augments standard smoothness regularization operators based on $H^1$- and $H^2$-seminorms with a constraint on the divergence of the velocity field, which resembles variational formulations for Stokes incompressible flows. In our formulation, we invert for a stationary velocity field and a mass source map. This allows us to explicitly control the compressibility of the deformation map and by that the determinant of the deformation gradient. We also introduce a new regularization scheme that allows us to control shear. We use a globalized, preconditioned, matrix-free, reduced space (Gauss--)Newton--Krylov scheme for numerical optimization. We exploit variable elimination techniques to reduce the number of unknowns of our system; we only iterate on the reduced space of the velocity field. Our current implementation is limited to the two-dimensional case. The numerical experiments demonstrate that we can control the determinant of the deformation gradient without compromising registration quality. This additional control allows us to avoid oversmoothing of the deformation map. We also demonstrate that we can promote or penalize shear while controlling the determinant of the deformation gradient.

التحسين والتحكم الرؤية الحاسوبية وتمييز الأنماط

Chunkflow: Distributed Hybrid Cloud Processing of Large 3D Images by Convolutional Nets

87 - Jingpeng Wu , William M. Silversmith , Kisuk Lee 2019

It is now common to process volumetric biomedical images using 3D Convolutional Networks (ConvNets). This can be challenging for the teravoxel and even petavoxel images that are being acquired today by light or electron microscopy. Here we introduce chunkflow, a software framework for distributing ConvNet processing over local and cloud GPUs and CPUs. The image volume is divided into overlapping chunks, each chunk is processed by a ConvNet, and the results are blended together to yield the output image. The frontend submits ConvNet tasks to a cloud queue. The tasks are executed by local and cloud GPUs and CPUs. Thanks to the fault-tolerant architecture of Chunkflow, cost can be greatly reduced by utilizing cheap unstable cloud instances. Chunkflow currently supports PyTorch for GPUs and PZnet for CPUs. To illustrate its usage, a large 3D brain image from serial section electron microscopy was processed by a 3D ConvNet with a U-Net style architecture. Chunkflow provides some chunk operations for general use, and the operations can be composed flexibly in a command line interface.

النظم الموزعة والتوازية والحوسبة العنقودية الرؤية الحاسوبية وتمييز الأنماط الخلايا العصبية والإدراك

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشھباء الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Distributed-memory large deformation diffeomorphic 3D image registration

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً