
Near real-time streaming analysis of big fusion data

Published by Ralph Kube
Publication date: 2021
Research field: Physics
Language: English





While experiments on fusion plasmas produce high-dimensional data time series of ever-increasing magnitude and velocity, data analysis has lagged behind this development. For example, many data analysis tasks are performed in a manual, ad hoc manner some time after an experiment. In this article we introduce the DELTA framework, which facilitates near real-time streaming analysis of big and fast fusion data. By streaming measurement data from fusion experiments to a high-performance compute center, DELTA makes it possible to perform demanding data analysis tasks between plasma pulses. This article describes the modular and expandable software architecture of DELTA and presents performance benchmarks of its individual components as well as of entire workflows. Our focus is on the streaming analysis of ECEi data measured at KSTAR on NERSC's supercomputers, and we routinely achieve data transfer rates of about 500 megabytes per second. We show that a demanding turbulence analysis workload can be distributed among multiple GPUs and executes in under 5 minutes. We further discuss how DELTA uses modern database systems and container orchestration services to provide web-based real-time data visualization. For the case of ECEi data we demonstrate how data visualizations can be augmented with outputs from machine learning models. Live visualization of higher-order data analysis results allows session leaders and physics operators to monitor the evolution of a long-pulse discharge in near real-time and to make more informed decisions on how to configure the machine for the next shot.




Read also

X-ray scattering experiments using Free Electron Lasers (XFELs) are a powerful tool to determine the molecular structure and function of unknown samples (such as COVID-19 viral proteins). XFEL experiments are a challenge to computing in two ways: i) due to the high cost of running XFELs, a fast turnaround time from data acquisition to data analysis is essential to make informed decisions on experimental protocols; ii) data collection rates are growing exponentially, requiring new scalable algorithms. Here we report our experiences analyzing data from two experiments at the Linac Coherent Light Source (LCLS) during September 2020. Raw data were analyzed on NERSC's Cori XC40 system, using the Superfacility paradigm: our workflow automatically moves raw data between LCLS and NERSC, where it is analyzed using the software package CCTBX. We achieved real-time data analysis with a turnaround time from data acquisition to full molecular reconstruction in as little as 10 minutes, which is sufficient time for the experiment's operators to make informed decisions. By hosting the data analysis on Cori, and by automating LCLS-NERSC interoperability, we achieved a data analysis rate that matches the data acquisition rate. Completing data analysis within 10 minutes is a first for XFEL experiments and an important milestone if we are to keep up with data collection trends.
This paper develops an incremental learning algorithm based on the quadratic inference function (QIF) to analyze streaming datasets with correlated outcomes such as longitudinal data and clustered data. We propose a renewable QIF (RenewQIF) method within a paradigm of renewable estimation and incremental inference, in which parameter estimates are recursively renewed with current data and summary statistics of historical data, but with no use of any historical subject-level raw data. We compare our renewable estimation method with both the offline QIF and offline generalized estimating equations (GEE) approaches, which process the entire cumulative subject-level data, and show theoretically and numerically that our renewable procedure enjoys statistical and computational efficiency. We also propose an approach to diagnose the homogeneity assumption of regression coefficients via a sequential goodness-of-fit test as a screening procedure for occurrences of abnormal data batches. We implement the proposed methodology by expanding the existing Spark Lambda architecture for the operation of statistical inference and data quality diagnosis. We illustrate the proposed methodology with extensive simulation studies and an analysis of streaming car crash datasets from the National Automotive Sampling System-Crashworthiness Data System (NASS CDS).
We present here Nested_fit, a Bayesian data analysis code developed for investigations of atomic spectra and other physical data. It is based on the nested sampling algorithm with the implementation of an upgraded lawn mower robot method for finding new live points. For a given data set and a chosen model, the program provides the Bayesian evidence, for the comparison of different hypotheses/models, and the different parameter probability distributions. A large database of spectral profiles is already available (Gaussian, Lorentz, Voigt, Log-normal, etc.) and additional ones can easily be added. It is written in Fortran, for optimized parallel computation, and it is accompanied by a Python library for visualizing the results.
Electric signals have recently been recorded at the Earth's surface with amplitudes appreciably larger than those hitherto reported. Their entropy in natural time is smaller than that, $S_u$, of a "uniform" distribution. The same holds for their entropy upon time-reversal. This behavior, as supported by numerical simulations in fBm time series and in an on-off intermittency model, stems from infinitely ranged long-range temporal correlations, and hence these signals are probably Seismic Electric Signals (critical dynamics). The entropy fluctuations are found to increase upon approaching bursting, which is reminiscent of the behavior that identifies sudden cardiac death individuals in the analysis of their electrocardiograms.
Recently, two novel techniques for the extraction of the phase-shift map (Tomassini et al., Applied Optics 40, 35 (2001)) and the electronic density map estimation (Tomassini P. and Giulietti A., Optics Communications 199, pp. 143-148 (2001)) have been proposed. In this paper we apply both methods to a sample laser-plasma interferogram obtained with a femtosecond probe pulse, in an experimental setup devoted to laser particle acceleration studies.