Deep learning-based statistical noise reduction for multidimensional spectral data

Published by Younsik Kim

Publication date: 2021

Research field: Informatics Engineering, Physics

Paper language: English

Authors: Younsik Kim, Dongjin Oh, Soonsang Huh

Machine Learning; Data Analysis, Statistics and Probability

Abstract

In spectroscopic experiments, acquiring data over a multidimensional phase space can take a long time because of the large phase-space volume to be covered. In such cases, the limited time available for data acquisition can be a serious constraint. Here, taking angle-resolved photoemission spectroscopy (ARPES) as an example, we demonstrate a denoising method that uses deep learning to overcome this constraint. With readily available ARPES data and random generation of the training data set, we successfully trained the denoising neural network without overfitting. The network removes the noise in the data while preserving its intrinsic information. We show that it allows us to perform a similar level of second-derivative and line-shape analysis on data taken with two orders of magnitude less acquisition time. The importance of our method lies in its applicability to any multidimensional spectral data that are susceptible to statistical noise.
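As a rough illustration of how noisy/clean training pairs might be generated at random, the sketch below assumes the statistical noise is Poisson counting noise and that a clean spectrum can be rescaled to a low total count before sampling. The abstract does not spell out the authors' procedure, so `toy_spectrum`, `sample_poisson`, `make_noisy_pair`, and all parameter values are hypothetical names and choices made for this example only.

```python
import math
import random


def toy_spectrum(n_angle=16, n_energy=24, width=1.5):
    """Hypothetical clean spectrum: one Lorentzian band dispersing linearly with angle."""
    spectrum = []
    for a in range(n_angle):
        center = 4.0 + a  # peak shifts by one energy bin per angle bin
        row = [1.0 / (1.0 + ((e - center) / width) ** 2) for e in range(n_energy)]
        spectrum.append(row)
    return spectrum


def sample_poisson(mean, rng):
    """Draw one Poisson count with Knuth's multiplication method (suited to small means)."""
    if mean <= 0:
        return 0
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def make_noisy_pair(clean, total_counts, rng):
    """Return (noisy, target): a Poisson-sampled low-count image and its noise-free expectation."""
    norm = sum(sum(row) for row in clean)
    scale = total_counts / norm  # assumed: short acquisition = fewer expected counts
    target = [[v * scale for v in row] for row in clean]
    noisy = [[sample_poisson(v, rng) for v in row] for row in target]
    return noisy, target


# Each call with a fresh random state yields a new training pair.
rng = random.Random(0)
noisy, target = make_noisy_pair(toy_spectrum(), total_counts=200, rng=rng)
```

Drawing a new `total_counts` for each pair would expose a network to a range of noise levels. Whether the original work does this is not stated in the abstract.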

96 - Rich Ormiston, Tri Nguyen, Michael Coughlin 2020

With the advent of gravitational wave astronomy, techniques to extend the reach of gravitational wave detectors are desired. In addition to the stellar-mass black hole and neutron star mergers already detected, many more are below the surface of the noise, available for detection if the noise is reduced enough. Our method (DeepClean) applies machine learning algorithms to gravitational wave detector data and data from on-site sensors monitoring the instrument to reduce the noise in the time-series due to instrumental artifacts and environmental contamination. This framework is generic enough to subtract linear, non-linear, and non-stationary coupling mechanisms. It may also provide handles in learning about the mechanisms which are not currently understood to be limiting detector sensitivities. The robustness of the noise reduction technique in its ability to efficiently remove noise with no unintended effects on gravitational-wave signals is also addressed through software signal injection and parameter estimation of the recovered signal. It is shown that the optimal signal-to-noise ratio (SNR) of the injected signal is enhanced by $\sim 21.6\%$ and the recovered parameters are consistent with the injected set. We present the performance of this algorithm on linear and non-linear noise sources and discuss its impact on astrophysical searches by gravitational wave detectors.

Instrumentation and Methods for Astrophysics; General Relativity and Quantum Cosmology; Data Analysis, Statistics and Probability
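The idea of subtracting noise that is also seen by an on-site witness sensor can be sketched for the simplest case only: a single linear, stationary coupling. This is not DeepClean itself, which uses machine learning to cover non-linear and non-stationary couplings. Here an ordinary least-squares fit stands in for the learned model, and the function names and test signals are assumptions made for the example.

```python
import math


def fit_linear_coupling(witness, strain):
    """Least-squares estimate of c in the assumed model strain = signal + c * witness."""
    numerator = sum(w * s for w, s in zip(witness, strain))
    denominator = sum(w * w for w in witness)
    return numerator / denominator


def subtract_witness_noise(witness, strain):
    """Remove the part of the strain channel that is linearly predictable from the witness."""
    coupling = fit_linear_coupling(witness, strain)
    cleaned = [s - coupling * w for s, w in zip(strain, witness)]
    return cleaned, coupling


# Toy data: a weak 7 Hz "signal" contaminated by a 60 Hz line seen by the witness sensor.
n = 1000
t = [i / n for i in range(n)]
witness = [math.sin(2 * math.pi * 60 * ti) for ti in t]
signal = [0.1 * math.sin(2 * math.pi * 7 * ti) for ti in t]
strain = [s + 0.8 * w for s, w in zip(signal, witness)]

cleaned, coupling = subtract_witness_noise(witness, strain)
```

The fit recovers the coupling cleanly here because the toy signal and the witness are orthogonal over the window. With real data, any correlation between the two would bias the estimate, which is one reason injection tests like those described in the abstract are needed.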

Data-Driven Wind Turbine Wake Modeling via Probabilistic Machine Learning

484 - S. Ashwin Renganathan, Romit Maulik, Stefano Letizia 2021

Wind farm design primarily depends on the variability of the wind turbine wake flows with the atmospheric wind conditions, and on the interaction between wakes. Physics-based models that capture the wake flow field with high fidelity are computationally too expensive for layout optimization of wind farms, and thus data-driven reduced-order models can be an efficient alternative for simulating wind farms. In this work, we use real-world light detection and ranging (LiDAR) measurements of wind-turbine wakes to construct predictive surrogate models using machine learning. Specifically, we first demonstrate the use of deep autoencoders to find a low-dimensional latent space that gives a computationally tractable approximation of the wake LiDAR measurements. Then, we learn the mapping between the parameter space and the (latent-space) wake flow fields using a deep neural network. We also demonstrate the use of a probabilistic machine learning technique, namely Gaussian process modeling, to learn the mapping from parameter space to latent space together with the epistemic and aleatoric uncertainty in the data. Finally, to cope with training on large datasets, we demonstrate the use of variational Gaussian process models, which provide a tractable alternative to conventional Gaussian process models. Furthermore, we introduce active learning to adaptively build and improve the predictive capability of a conventional Gaussian process model. Overall, we find that our approach provides accurate approximations of the wind-turbine wake flow field that can be queried at a cost orders of magnitude lower than that of high-fidelity physics-based simulations.

Machine Learning; Data Analysis, Statistics and Probability

A data-based comparative review and AI-driven symbolic model for longitudinal dispersion coefficient in natural streams

78 - Yifeng Zhao, Zicheng Liu, Pei Zhang 2021

A better understanding of dispersion in natural streams requires knowledge of the longitudinal dispersion coefficient (LDC). Various methods have been proposed for predicting the LDC. These studies can be grouped into three types: analytical, statistical, and ML-driven (implicit and explicit). However, a comprehensive evaluation of them is still lacking. In this paper, we first present an in-depth analysis of these methods and identify their defects. The analysis is carried out on an extensive database of 660 samples of hydraulic and channel properties from around the world. The reliability and representativeness of the data are enhanced by using Subset Selection of Maximum Dissimilarity (SSMD) to select the testing set and the interquartile range (IQR) to remove outliers. The evaluation ranks the methods as: ML-driven > statistical > analytical. Whereas implicit ML-driven methods are black boxes by nature, explicit ML-driven methods have more potential for predicting the LDC. In addition, overfitting is a universal problem in existing models, which also suffer from a fixed parameter combination. To establish an interpretable, higher-performing model for LDC prediction, we then design a novel symbolic regression method called the evolutionary symbolic regression network (ESRN), which combines genetic algorithms and neural networks. Strategies are introduced to avoid overfitting and to explore more parameter combinations. Results show that the ESRN model outperforms other existing symbolic models. The proposed model is suitable for practical engineering problems because it requires few parameters (only w and U*). It can provide convincing solutions where field tests cannot be carried out or only limited field information can be obtained.

Machine Learning; Data Analysis, Statistics and Probability
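The IQR-based outlier removal mentioned in the abstract can be sketched as follows. The abstract gives neither the fence multiplier nor the quantile convention, so the common 1.5 × IQR rule and the default method of Python's `statistics.quantiles` are assumptions here, and `remove_outliers_iqr` and the sample values are hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is the conventional choice."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Made-up sample with one obviously anomalous measurement.
samples = [10, 12, 11, 13, 12, 11, 10, 500]
filtered = remove_outliers_iqr(samples)
```

For this sample the quartiles are 10.25 and 12.75, so the fences sit at 6.5 and 16.5 and only the value 500 is dropped.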

Scalable multicomponent spectral analysis for high-throughput data annotation

122 - Rui Patrick Xian, Ralph Ernstorfer, Philipp Michael Pelz 2021

Orchestrating parametric fitting of multicomponent spectra at scale is an essential yet underappreciated task in high-throughput quantification of materials and chemical composition. To automate the annotation process for spectroscopic and diffraction data collected in counts of hundreds to thousands, we present a systematic approach compatible with high-performance computing infrastructures using the MapReduce model and task-based parallelization. We implement the approach in software and demonstrate linear computational scaling with respect to spectral components using multidimensional experimental materials characterization datasets from photoemission spectroscopy and powder electron diffraction as benchmarks. Our approach enables efficient generation of high-quality data annotation and online spectral analysis and is applicable to a variety of analytical techniques in materials science and chemistry as a building block for closed-loop experimental systems.

Materials Science; Data Analysis, Statistics and Probability

Stochastic Variance Reduction for Deep Q-learning

406 - Wei-Ye Zhao, Xi-Ya Guan, Yang Liu 2019

Recent advances in deep reinforcement learning have achieved human-level performance on a variety of real-world applications. However, current algorithms still suffer from poor gradient estimation with excessive variance, resulting in unstable training and poor sample efficiency. In our paper, we propose an innovative optimization strategy that uses stochastic variance reduced gradient (SVRG) techniques. In extensive experiments on the Atari domain, our method outperforms the deep Q-learning baselines on 18 out of 20 games.

Machine Learning
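To make the SVRG idea concrete, here is a minimal sketch of the standard SVRG update applied to a toy one-parameter least-squares problem. It is not the authors' deep Q-learning integration; the loss, step size, epoch count, and function names are assumptions chosen to keep the example small.

```python
import random


def grad_i(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def svrg(xs, ys, step=0.05, epochs=50, inner_steps=None, seed=0):
    """Fit y = w * x with SVRG: stochastic gradients corrected by a snapshot full gradient."""
    rng = random.Random(seed)
    n = len(xs)
    inner_steps = inner_steps or n
    w = 0.0
    for _ in range(epochs):
        w_snap = w  # snapshot of the parameters
        full_grad = sum(grad_i(w_snap, x, y) for x, y in zip(xs, ys)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced estimate: unbiased, and its variance shrinks as w nears w_snap.
            g = grad_i(w, xs[i], ys[i]) - grad_i(w_snap, xs[i], ys[i]) + full_grad
            w -= step * g
    return w


# Noise-free toy data generated with slope 3.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]
w_hat = svrg(xs, ys)
```

The correction term `- grad_i(w_snap, ...) + full_grad` is what separates SVRG from plain stochastic gradient descent. In a deep Q-learning setting the same correction would apply to the gradient of the temporal-difference loss, but how the paper schedules snapshots is not described in the abstract.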