
Over recent decades there has been a continuing international effort to develop realistic space weather prediction tools that forecast conditions on the Sun and in the interplanetary environment. These efforts have created the need for appropriate metrics to assess the performance of those tools. Metrics are necessary for validating models, comparing different models, and monitoring adjustments or improvements of a given model over time. In this work, we introduce Dynamic Time Warping (DTW) as an alternative way to validate models and, in particular, to quantify differences between observed and synthetic (modeled) time series for space weather purposes. We present the advantages and drawbacks of this method as well as applications to WIND observations and EUHFORIA modeled output at L1. We show that DTW is a useful tool that permits the evaluation of both the fast and slow solar wind. Its distinctive characteristic is that it warps sequences in time, aiming to align them at minimum cost by using dynamic programming. It can be applied in two different ways to evaluate modeled solar wind time series. The first calculates the so-called sequence similarity factor (SSF), a number that quantifies how good the forecast is compared to best-case and worst-case prediction scenarios. The second quantifies the time and amplitude differences between the points that are best matched between the two sequences. As a result, DTW can serve as a hybrid metric between continuous measurements (such as the correlation coefficient) and point-by-point comparisons. We conclude that DTW is a promising technique for the assessment of solar wind profiles, offering functions that other metrics do not, so that it can give at once the most complete evaluation profile of a model.
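As an illustration of the alignment step described in this abstract, the following is a minimal sketch of textbook DTW in Python, not the authors' implementation. The function names `dtw` and `matched_differences`, the absolute-difference local cost, and the unconstrained warping window are all assumptions made for this example. The SSF is not computed, because the abstract does not give its formula.

```python
import math

def dtw(observed, modeled):
    """Return (total_cost, path) for a classic DTW alignment.

    Assumptions for this sketch: the local cost is the absolute
    difference, and no warping-window constraint is applied.
    """
    n, m = len(observed), len(modeled)
    # cost[i][j] = minimal cumulative cost aligning the first i observed
    # points with the first j modeled points.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(observed[i - 1] - modeled[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match both points
                                 cost[i - 1][j],      # repeat a modeled point
                                 cost[i][j - 1])      # repeat an observed point

    # Backtrack from the end to recover the matched index pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def matched_differences(observed, modeled, path):
    """Time (index) and amplitude differences for each matched pair."""
    time_diffs = [j - i for i, j in path]
    amp_diffs = [modeled[j] - observed[i] for i, j in path]
    return time_diffs, amp_diffs

if __name__ == "__main__":
    obs = [0.0, 1.0, 2.0]
    mod = [0.0, 0.0, 1.0, 2.0]  # same shape, delayed by one sample
    total, path = dtw(obs, mod)
    print(total, path)
    print(matched_differences(obs, mod, path))
```

The per-pair time and amplitude differences returned by `matched_differences` correspond to the second use of DTW mentioned in the abstract. For real hourly solar wind series, a windowed variant would likely be preferable so that points are not matched across unrealistically long time shifts.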
140 - Nachiket H. Gokhale 2021
We explore the application of a Convolutional Neural Network (CNN) to image the shear modulus field of an almost incompressible, isotropic, linear elastic medium in plane strain using displacement or strain field data. This problem is important in medicine because the shear modulus of suspicious and potentially cancerous growths in soft tissue is elevated by about an order of magnitude compared to the background of normal tissue. Imaging the shear modulus field can therefore lead to high-contrast medical images. Our imaging problem is: given a displacement or strain field (or its components), predict the corresponding shear modulus field. Our CNN is trained using 6000 training examples, each consisting of a displacement or strain field and a corresponding shear modulus field. We observe encouraging results which warrant further research and show the promise of this methodology.
Optical emission spectroscopy from a small-volume (5 µL), atmospheric pressure, RF-driven helium plasma was used in conjunction with Partial Least Squares Discriminant Analysis (PLS-DA) for the detection of trace concentrations of methane gas. A limit of detection of 1 ppm was obtained and sample concentrations up to 100 ppm CH4 were classified using a nine-category model. A range of algorithm enhancements was investigated, including regularization, simple data segmentation and subset selection, VIP feature selection, and wavelength variable compression, in order to address the high dimensionality and collinearity of spectral emission data. These approaches showed the potential for significant reduction in the number of wavelength variables and the spectral resolution/bandwidth. Wavelength variable compression exhibited reliable predictive performance, with accuracy values > 97%, under more challenging multi-session train-test scenarios. Simple modelling of plasma electron energy distribution functions highlights the complex cross-sensitivities between the target methane, its dissociation products and atmospheric impurities, and their impact on excitation and emission.
One of the outstanding analytical problems in X-ray single particle imaging (SPI) is the classification of structural heterogeneity, which is especially difficult given the low signal-to-noise ratios of individual patterns and the fact that even identical objects can yield patterns that vary greatly when orientation is taken into consideration. We propose two methods which explicitly account for this orientation-induced variation and can robustly determine the structural landscape of a sample ensemble. The first, termed common-line principal component analysis (PCA), provides a rough classification which is essentially parameter-free and can be run automatically on any SPI dataset. The second method, utilizing variational auto-encoders (VAEs), can generate 3D structures of the objects at any point in the structural landscape. We implement both methods in combination with the noise-tolerant expand-maximize-compress (EMC) algorithm and demonstrate their utility by applying them to an experimental dataset from gold nanoparticles with only a few thousand photons per pattern, recovering both discrete structural classes and continuous deformations. These developments diverge from previous approaches of extracting reproducible subsets of patterns from a dataset and open up the possibility of moving beyond homogeneous sample sets to study open questions on topics such as nanocrystal growth and dynamics, as well as phase transitions which have not been externally triggered.
The REST-for-Physics (Rare Event Searches Toolkit for Physics) framework is a ROOT-based solution providing the means to process and analyze experimental or Monte Carlo event data. Special care has been taken over the traceability of the code and the validation of the results produced within the framework, together with the connectivity between code and stored data, registered through specific version metadata members. The framework development was originally motivated by the needs of rare event search experiments (experiments looking for phenomena with extremely low occurrence probability, such as dark matter or neutrino interactions or rare nuclear decays), and its components naturally implement tools to address the challenges in these kinds of experiments; the integration of a detector physics response, the implementation of signal processing routines, and topological algorithms for physical event identification are some examples. Despite this specialization, the framework was conceived with scalability in mind, and other event-oriented applications could benefit from the data processing routines and/or metadata description implemented in REST, since the generic framework tools are completely decoupled from dedicated libraries. REST-for-Physics is a consolidated piece of software already serving the needs of different physics experiments - using gaseous Time Projection Chambers (TPCs) as detection technology - for background data analysis and detector characterization, as well as generic detector R&D. Even though REST has been exploited mainly with gaseous TPCs, the code could be easily applied or adapted to other detection technologies. In this work we present an overview of REST-for-Physics, providing a broad perspective on the infrastructure and organization of the project as a whole. The framework and its different components are described in the text.
Methods for time series prediction and classification of gene regulatory networks (GRNs) from gene expression data have been treated separately so far. The recent emergence of attention-based recurrent neural network (RNN) models has boosted the interpretability of RNN parameters, making them appealing for the understanding of gene interactions. In this work, we generated synthetic time series gene expression data from a range of archetypal GRNs and relied on a dual attention RNN to predict the gene temporal dynamics. We show that the prediction is extremely accurate for GRNs with different architectures. Next, we focused on the attention mechanism of the RNN and, using tools from graph theory, found that its graph properties allow us to hierarchically distinguish different architectures of the GRN. We show that the GRNs respond differently to the addition of noise in the prediction by the RNN, and we relate the noise response to the analysis of the attention mechanism. In conclusion, this work provides a way to understand and exploit the attention mechanism of RNNs and paves the way for RNN-based methods for time series prediction and inference of GRNs from gene expression data.
We introduce a methodology to visualize the limit order book (LOB) using a particle physics lens. The open-source data-analysis tool ROOT, developed by CERN, is used to reconstruct and visualize futures markets. Message-based data is used, rather than snapshots, as it offers numerous visualization advantages. The visualization method can include multiple variables and markets simultaneously and is not necessarily time dependent. Stakeholders can use it to visualize high-velocity data to gain a better understanding of markets or to monitor markets effectively. In addition, the method is easily adjustable to user specifications to examine various LOB research topics, thereby complementing existing methods.
Of the numerous hazards posing a threat to sustainable environmental conditions in the 21st century, only a few have a graver impact than air pollution. Its importance in determining health and living standards in urban settings is only expected to increase with time. Various factors, ranging from traffic and power plant emissions to household emissions and natural causes, are known to be primary causal agents or influencers behind rising air pollution levels. However, the lack of large-scale data involving the major factors has hindered research on the causes and relations governing the variability of the different air pollutants. Through this work, we introduce a large-scale city-wise dataset for exploring the relationships among these agents over a long period of time. We analyze and explore the dataset to bring out inferences which can be derived by modeling the data. We also provide a set of benchmarks for the problem of estimating or forecasting pollutant levels with a set of diverse models and methodologies. Through our paper, we seek to provide a foundation for further research into this domain, which will demand our critical attention in the near future.
Recent advances in (scanning) transmission electron microscopy have enabled routine generation of large volumes of high-veracity structural data on 2D and 3D materials, naturally raising the challenge of using these as starting inputs for atomistic simulations. In this fashion, theory will address experimentally emerging structures, as opposed to the full range of theoretically possible atomic configurations. However, this challenge is highly non-trivial due to the extreme disparity between the intrinsic time scales accessible to modern simulations and microscopy, as well as the latencies of microscopy and simulations per se. Addressing this issue requires, as a first step, bridging the instrumental data flow and the physics-based simulation environment, to enable the selection of regions of interest and their exploration using physical simulations. Here we report the development of a machine learning workflow that directly bridges the instrument data stream into Python-based molecular dynamics and density functional theory environments, using pre-trained neural networks to convert imaging data to physical descriptors. The pathways to ensure structural stability and compensate for the observational biases universally present in the data are identified in the workflow. This approach is used for a graphene system to reconstruct optimized geometry and simulate temperature-dependent dynamics, including adsorption of Cr as an ad-atom and graphene healing effects. However, it is universal and can be used for other material systems.
Automated object identification and feature analysis of experimental image data are indispensable for data-driven materials science; deep-learning-based segmentation algorithms have been shown to be a promising technique to achieve this goal. However, acquiring high-resolution experimental images and assigning labels in order to train such algorithms is challenging and costly in terms of both time and labor. In the present work, we use synthetic images, which resemble the experimental image data in terms of geometrical and visual features, to train state-of-the-art deep-learning-based Mask R-CNN algorithms to segment vanadium pentoxide (V2O5) nanowires, a canonical cathode material, within optical intensity-based images from spectromicroscopy. The performance evaluation demonstrates that even though the deep learning model is trained on purely synthetically generated structures, it can segment real optical intensity-based spectromicroscopy images of complex V2O5 nanowire structures in overlapped particle networks, thus providing reliable statistical information. The model can further be used to segment nanowires in scanning electron microscopy (SEM) images, which are fundamentally different from the training dataset known to the model. The proposed methodology of using a purely synthetic dataset to train the deep learning model can be extended to any optical intensity-based images of variable particle morphology, extent of agglomeration, material class, and beyond.
