
File-based localization of numerical perturbations in data analysis pipelines

Posted by: Ali Salari
Publication date: 2020
Language: English





Data analysis pipelines are known to be impacted by computational conditions, presumably due to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise causes of such instabilities and the path along which they propagate in pipelines are unclear. We present Spot, a tool to identify which processes in a pipeline create numerical differences when executed in different computational conditions. Spot leverages system-call interception through ReproZip to reconstruct and compare provenance graphs without pipeline instrumentation. By applying Spot to the structural pre-processing pipelines of the Human Connectome Project, we found that linear and non-linear registration are the cause of most numerical instabilities in these pipelines, which confirms previous findings.
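The core idea described above can be illustrated with a minimal sketch. This is not Spot itself (which reconstructs full provenance graphs from ReproZip's system-call traces); it is a hedged, simplified model in which each run is represented as a mapping from process name to the files it wrote, and processes whose outputs differ between two computational conditions are flagged. All names and data structures here are assumptions for illustration.

```python
# Simplified sketch of the idea behind Spot (not its actual implementation):
# compare, per process, checksums of the files written in two runs of the
# same pipeline under different computational conditions, and flag the
# processes that produced numerically different outputs.
import hashlib

def file_digest(contents):
    # Hash file contents; differing digests indicate a numerical perturbation.
    return hashlib.sha256(contents).hexdigest()

def diff_runs(run_a, run_b):
    """run_a, run_b: mapping process name -> {filename: bytes written}.

    Returns the processes whose written files differ between the runs.
    """
    unstable = []
    for proc in run_a:
        hashes_a = {f: file_digest(c) for f, c in run_a[proc].items()}
        hashes_b = {f: file_digest(c) for f, c in run_b[proc].items()}
        if hashes_a != hashes_b:
            unstable.append(proc)
    return unstable

# Hypothetical example: only the registration step is condition-sensitive.
run_a = {"skull_strip": {"mask.nii": b"mask"},
         "linear_registration": {"out.nii": b"0.1001"}}
run_b = {"skull_strip": {"mask.nii": b"mask"},
         "linear_registration": {"out.nii": b"0.1002"}}
```

In this toy setup, `diff_runs(run_a, run_b)` would single out `linear_registration`, mirroring the paper's finding that registration steps are the dominant source of instability.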




Read also

This paper aims to create a transition path from file-based IO to streaming-based workflows for scientific applications in an HPC environment. By using the openPMD-api, traditional workflows limited by filesystem bottlenecks can be overcome and flexibly extended for in situ analysis. The openPMD-api is a library for the description of scientific data according to the Open Standard for Particle-Mesh Data (openPMD). Its approach towards recent challenges posed by hardware heterogeneity lies in the decoupling of data description in domain sciences, such as plasma physics simulations, from concrete implementations in hardware and IO. The streaming backend is provided by the ADIOS2 framework, developed at Oak Ridge National Laboratory. This paper surveys two openPMD-based loosely coupled setups to demonstrate flexible applicability and to evaluate performance. In loose coupling, as opposed to tight coupling, two (or more) applications are executed separately, e.g. in individual MPI contexts, yet cooperate by exchanging data. This way, a streaming-based workflow allows for standalone codes instead of tightly-coupled plugins, using a unified streaming-aware API and leveraging high-speed communication infrastructure available in modern compute clusters for massive data exchange. We identify new challenges in resource allocation and in the need for strategies for flexible data distribution, demonstrating their influence on efficiency and scaling on the Summit compute system. The presented setups show the potential for a more flexible use of compute resources brought by streaming IO as well as the ability to increase throughput by avoiding filesystem bottlenecks.
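The loose-coupling pattern the abstract describes can be sketched in miniature. This is not the openPMD-api or ADIOS2; it is a hedged stand-in in which a producer and an in-situ consumer exchange data over a bounded in-memory queue instead of the filesystem, with threads standing in for the separately launched applications. All names are illustrative assumptions.

```python
# Hedged sketch of a streaming (loosely coupled) workflow: a "simulation"
# streams fields to an in-situ "analysis" over a bounded queue, so no data
# ever touches the filesystem. Real setups (openPMD-api over ADIOS2) run
# these as separate applications, e.g. in individual MPI contexts.
import queue
import threading

def producer(q, steps):
    # Simulation loop: emit one field per step instead of writing a file.
    for step in range(steps):
        q.put([step + i for i in range(4)])
    q.put(None)  # end-of-stream marker

def consume(q):
    # In situ analysis: reduce each field as it arrives.
    total = 0
    while (field := q.get()) is not None:
        total += sum(field)
    return total

def run(steps=3):
    q = queue.Queue(maxsize=2)  # bounded buffer gives streaming backpressure
    t = threading.Thread(target=producer, args=(q, steps))
    t.start()
    total = consume(q)
    t.join()
    return total
```

The bounded queue is the essential design point: the producer blocks when the consumer lags, which is the backpressure behavior a streaming IO layer provides in place of a filesystem staging area.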
Artificial intelligence (AI) classification holds promise as a novel and affordable screening tool for clinical management of ocular diseases. Rural and underserved areas, which suffer from lack of access to experienced ophthalmologists, may particularly benefit from this technology. Quantitative optical coherence tomography angiography (OCTA) imaging provides excellent capability to identify subtle vascular distortions, which are useful for classifying retinovascular diseases. However, application of AI for differentiation and classification of multiple eye diseases is not yet established. In this study, we demonstrate supervised machine learning based multi-task OCTA classification. We sought 1) to differentiate normal from diseased ocular conditions, 2) to differentiate different ocular disease conditions from each other, and 3) to stage the severity of each ocular condition. Quantitative OCTA features, including blood vessel tortuosity (BVT), blood vascular caliber (BVC), vessel perimeter index (VPI), blood vessel density (BVD), foveal avascular zone (FAZ) area (FAZ-A), and FAZ contour irregularity (FAZ-CI), were fully automatically extracted from the OCTA images. A stepwise backward elimination approach was employed to identify sensitive OCTA features and optimal feature combinations for the multi-task classification. For proof-of-concept demonstration, diabetic retinopathy (DR) and sickle cell retinopathy (SCR) were used to validate the supervised machine learning classifier. The presented AI classification methodology is applicable and can be readily extended to other ocular diseases, holding promise to enable a mass-screening platform for clinical deployment and telemedicine.
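The stepwise backward elimination step mentioned above follows a standard greedy pattern, which can be sketched generically. This is not the paper's implementation; the scoring function and stopping rule below are assumptions, and the toy score used in the example is purely illustrative.

```python
# Generic sketch of stepwise backward elimination: starting from the full
# feature set, repeatedly drop the feature whose removal hurts a scoring
# function the least, stopping when every removal would degrade the score.
def backward_eliminate(features, score, min_features=1):
    """features: list of feature names.
    score(subset) -> float, higher is better (e.g. cross-validated accuracy).
    """
    current = list(features)
    best = score(current)
    while len(current) > min_features:
        # Score every candidate subset with one feature removed.
        candidates = [(score([f for f in current if f != g]), g)
                      for g in current]
        cand_score, dropped = max(candidates)
        if cand_score < best:
            break  # every removal hurts: keep the current subset
        best = cand_score
        current.remove(dropped)
    return current, best

# Illustrative toy score: pretend only BVD and FAZ-A carry signal, with a
# small penalty per retained feature (a stand-in for model complexity).
feats = ["BVT", "BVC", "VPI", "BVD", "FAZ-A", "FAZ-CI"]
def toy_score(subset):
    return len(set(subset) & {"BVD", "FAZ-A"}) - 0.01 * len(subset)
```

With this toy score, the procedure prunes away the four uninformative features and retains `BVD` and `FAZ-A`, which is the qualitative behavior one wants from feature elimination.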
Central nervous system (CNS) tumors present a vastly heterogeneous histologic, molecular, and radiographic landscape, rendering their precise characterization challenging. The rapidly growing fields of biophysical modeling and radiomics have shown promise in better characterizing the molecular, spatial, and temporal heterogeneity of tumors. Integrative analysis of CNS tumors, including clinically-acquired multi-parametric magnetic resonance imaging (mpMRI) and the inverse problem of calibrating biophysical models to mpMRI data, assists in identifying macroscopic quantifiable tumor patterns of invasion and proliferation, potentially leading to improved (i) detection/segmentation of tumor sub-regions, and (ii) computer-aided diagnostic/prognostic/predictive modeling. This paper presents a summary of (i) biophysical growth modeling and simulation, (ii) inverse problems for model calibration, (iii) their integration with imaging workflows, and (iv) their application on clinically-relevant studies. We anticipate that such quantitative integrative analysis may even be beneficial in a future revision of the World Health Organization (WHO) classification for CNS tumors, ultimately improving patient survival prospects.
Purpose: A conventional 2D UNet convolutional neural network (CNN) architecture may result in ill-defined boundaries in segmentation output. Several studies imposed stronger constraints on each level of UNet to improve the performance of 2D UNet, such as SegNet. In this study, we investigated 2D SegNet and a proposed conditional random field insert (CRFI) for zonal prostate segmentation from clinical T2-weighted MRI data. Methods: We introduced a new methodology that combines SegNet and CRFI to improve the accuracy and robustness of the segmentation. CRFI has feedback connections that encourage data consistency at multiple levels of the feature pyramid. On the encoder side of the SegNet, the CRFI combines the input feature maps and convolution block output based on their spatial local similarity, like a trainable bilateral filter. For all networks, 725 2D images (i.e., 29 MRI cases) were used in training, while 174 2D images (i.e., 6 cases) were used in testing. Results: The SegNet with CRFI achieved relatively high Dice coefficients (0.76, 0.84, and 0.89) for the peripheral zone, central zone, and whole gland, respectively. Compared with UNet, the SegNet+CRFI segmentation generally achieved higher Dice scores and was more robust in determining the boundaries of anatomical structures than the SegNet or UNet segmentation. The SegNet with a CRFI at the end showed that the CRFI can correct segmentation errors from the SegNet output, generating smooth and consistent segmentation for the prostate. Conclusion: The UNet-based deep neural networks demonstrated in this study can perform zonal prostate segmentation, achieving high Dice coefficients compared with those in the literature. The proposed CRFI method can reduce the fuzzy boundaries that affected the segmentation performance of the baseline UNet and SegNet models.
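The bilateral-filter-like combination the abstract describes can be illustrated with a heavily simplified, pointwise sketch. This is not the paper's trainable CRFI (which operates on 2D feature maps inside a network); the Gaussian similarity weighting and all names below are assumptions for illustration.

```python
# Hedged, pointwise sketch of a bilateral-filter-style blend: where the
# input feature map and the convolution-block output agree, trust the
# output; where they disagree strongly, fall back toward the input to
# encourage data consistency (the role the CRFI plays in SegNet+CRFI).
import math

def blend(inp, out, sigma=1.0):
    """inp, out: equal-length lists of floats (a 1D stand-in for feature maps)."""
    blended = []
    for x, y in zip(inp, out):
        # Similarity weight in (0, 1]: 1 when x == y, small when they differ.
        w = math.exp(-((x - y) ** 2) / (2 * sigma ** 2))
        blended.append(w * y + (1 - w) * x)
    return blended
```

A real CRFI additionally uses spatial neighborhoods and learned parameters; this pointwise version only conveys the consistency-weighting intuition.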
Neural recordings are nonstationary time series, i.e. their properties typically change over time. Identifying specific changes, e.g. those induced by a learning task, can shed light on the underlying neural processes. However, such changes of intere st are often masked by strong unrelated changes, which can be of physiological origin or due to measurement artifacts. We propose a novel algorithm for disentangling such different causes of non-stationarity and in this manner enable better neurophysiological interpretation for a wider set of experimental paradigms. A key ingredient is the repeated application of Stationary Subspace Analysis (SSA) using different temporal scales. The usefulness of our explorative approach is demonstrated in simulations, theory and EEG experiments with 80 Brain-Computer-Interfacing (BCI) subjects.